GENOTYPING BY SIMULTANEOUS ANALYSIS
OF MULTIPLE MICROSATELLITE LOCI

The work leading to this invention was supported in part by Grant No. GM 47145 from 
the National Institutes of Health. The United States Government may retain certain rights in this 
invention. 

BACKGROUND OF THE INVENTION 
Field of the Invention 

This invention is directed to semi-automated methods for linkage mapping of the genome
by genotyping of multiple microsatellite loci.
Summary of Background Information 

For most genetic disorders, there is no known biochemical defect. Consequently, the
mutant genes associated with the disease and their disease-causing abnormal gene products are
recognized solely by the anomalous phenotype they produce. Identifying the chromosomal
localization for the gene(s) that produce these disease phenotypes is often the first crucial step
toward isolation and characterization of the mutation(s) by recombinant DNA techniques.

The significance of mapping a gene is perhaps better appreciated when put into context
with the human genome project. Consider for a moment that even after every base of the DNA
in the entire human genome has been sequenced through the Human Genome Initiative (HGI),
and every gene has been localized in this sequence, it may still not be clear which disorder(s)
arise from which gene(s). Each disease phenotype will still need to be "mapped" or associated
with a particular location in the genome. This is usually carried out by analyzing DNA isolated
from blood specimens collected from individuals within families affected by a genetic disorder.
Once a disorder or abnormal phenotype has been linked to a particular region on a chromosome, 
the limited number of genes within this area will permit us to suggest a candidate gene that can 
contribute to the phenotype. Thus, once the localization of a major disease phenotype to a 
chromosomal region is confirmed, a few candidate genes can be examined for mutations as well 
as potential pathogenic mechanisms. 

If no genes have been mapped to the region, then linkage studies with closely spaced
surrounding markers can often be used to delineate a large chromosomal interval (1-2 Mb) in
which to search for transcribed sequences. This approach (originally termed "reverse genetics")
is now generally referred to as "positional cloning". In the past the isolation of candidate genes
from these large genomic regions was the rate-limiting step in positional cloning, requiring years
of intensive work. However, recent improvements in methods to capture expressed sequences
encoded within large genomic segments have been described. Thus, there is now a need for
advances in the molecular genetic methods employed in the linkage mapping of disease genes.
Linkage 

The chromosomes are the basic units of inheritance on which genes and DNA markers
are organized in a linear fashion (see Figure 1). Linkage is evident when a gene(s) that
produces a phenotypic trait, or a significant portion of the trait, and the surrounding DNA
markers are inherited together (cosegregate at meiosis). In contrast, those markers that are not
associated with the anomalous phenotype of interest will be randomly distributed among affected
family members as a result of the independent assortment of chromosomes and crossing over
during meiosis (see Figure 2, compare "A" markers to "B"-"F" markers).

In general, the further a marker, or gene, is from the genetic locus of interest (for
example, markers 1 and 4 as compared to markers 2 and 3 in Figure 1), the more likely they
will be separated by crossing over at meiosis. The recombinant genotypes produced by crossing
over between maternal and paternal chromosomes at meiosis allow us to predict the ordering
of genes and markers through the interval under examination. Recombination between the
markers 1A and 3A, and 2A and 4A in the affected members in Figure 2, suggests that the
mutant gene of interest lies between markers 1 and 4. Thus linkage to a marker of known
chromosomal location allows placement of the phenotype on the chromosomal map.

Analysis for testing linkage with use of DNA markers is based on standard likelihood
theory. The DNA markers are used to recognize each of the parental chromosomes. Recall that
in general each chromosome is inherited independently of any other, and the likelihood of
inheriting either chromosome of a pair from each parent is 50:50. Therefore, when a marker
is unlinked to the gene(s) producing an anomalous phenotype, one expects both the maternal and
paternal chromosomes to be equally distributed in the affected offspring.

Linkage in the human is established by the method of likelihood ratios (see Ott, 1992,
"Analysis of Human Genetic Linkage," The Johns Hopkins University Press, Baltimore, for a
review). One compares the probability that observed family data, such as that in Figure 2,
would arise under one hypothesis (for instance, linkage with no recombination with marker 2
or 3) to the probability that it would arise under an alternative hypothesis (typically, nonlinkage).
The ratio of these probabilities is called the odds ratio for one hypothesis relative to the other.
By convention, mammalian geneticists prefer the log of the odds ratio, or the lod score.
Generally, linkage is considered proven when the odds in favor of linkage versus nonlinkage
become overwhelming, or reach 1000:1 (LOD = 3) (see Morton, 1955, Am. J. Hum. Genet.,
7:277-318). Linkage is rejected when the odds drop to 100:1 against this hypothesis (LOD = -2).
The maximum likelihood estimate of the recombination fraction is the value at which the
likelihood ratio is largest. Because lod scores are logarithms, scores from multiple pedigrees
can be added until the total grows to 3 (signifying 1000:1 odds) or falls to -2 (indicating
1:100 odds). Linkage can be easily evaluated using likelihood ratios, even in complicated
pedigrees, by testing on the computer for these competing hypotheses. Recently, additional
strategies have been devised that can handle genetic heterogeneity more effectively (Ott, 1974,
Am. J. Hum. Genet., 26:588-597) as well as disorders caused by multiple genes (Lander, et al.,
1986, Proc. Natl. Acad. Sci. USA, 83:7353-7357).
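
The lod score calculation described above can be illustrated with a minimal sketch. It assumes the simplest textbook case, phase-known meioses in which recombinants can be counted directly, so that the likelihood at recombination fraction theta is theta^k (1 - theta)^(n - k). The function names and the family counts are hypothetical and are not taken from this specification; real pedigree analysis requires dedicated linkage software.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD score for phase-known meioses at recombination fraction theta.

    Compares L(theta) = theta^k * (1 - theta)^(n - k) against the
    no-linkage likelihood L(0.5) = 0.5^n, on a log10 scale.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    k, n = recombinants, meioses
    if not 0 <= k <= n:
        raise ValueError("recombinants must lie in [0, meioses]")
    if theta == 0.0:
        # A single observed recombinant rules out theta = 0 entirely.
        return n * math.log10(2.0) if k == 0 else float("-inf")
    log_linked = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked


def combined_lod(families, theta):
    """Sum LOD scores over independent families at a shared theta."""
    return sum(lod_score(k, n, theta) for k, n in families)


def best_theta(families, steps=50):
    """Grid search for the theta in [0, 0.5] that maximises the combined LOD."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: combined_lod(families, t))


# Hypothetical data: (recombinants, informative meioses) per family.
families = [(0, 6), (1, 8), (0, 5)]
theta_hat = best_theta(families)
print(theta_hat, round(combined_lod(families, theta_hat), 2))
```

Under these assumptions, ten fully informative meioses with no recombinants give a lod score just above 3 at theta = 0, which is why small families are usually pooled before the threshold is reached.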

The descriptions of many types of DNA sequence polymorphisms have provided the
fundamental basis for our understanding of the structure of the mammalian genome (CEPH
consortium map, 1992, Science, 258:67-86; Weissenbach et al., 1992, Nature, 359:794). The
construction of extensive framework linkage maps has been greatly facilitated by the use of these
DNA polymorphisms, and has provided a practical means for the localization of disease genes
by linkage. The process of linkage mapping in Mendelian and complex disorders using these
techniques has been further facilitated by the recent description of a detailed "second-generation"
linkage map of the human genome (Weissenbach et al., 1992). In particular, the recent
description of highly polymorphic PCR-based microsatellite markers for genotyping has greatly
advanced the construction of high resolution linkage maps (Weber and May, 1989, Am. J. Hum.
Genet., 44:388-396; Litt and Luty, 1989, Am. J. Hum. Genet., 44:397-401).

The microsatellite markers are highly polymorphic, simple sequence repeat (SSR)
markers, generally defined as repeats of 6 bp or less running in tandem for up to 100 bp long
(Beckmann, et al., 1992, Genomics, 12:627-631). These repeat sequences are flanked by unique
DNA sequences that may be identified for each marker location. With primers that correspond
to the unique DNA sequence surrounding each marker, the polymerase chain reaction (PCR, see,
e.g., Saiki, et al., 1988, Science, 239:489) can be used to detect each polymorphism.

This type of genetic marker is abundant and found throughout the genome. SSR may be
as frequent as one every 6 kb (Beckmann, et al., 1992). Where SSR markers show considerable
polymorphism (differences in the number of repeats) between individuals, the markers can be
particularly informative. Many such SSR markers have been isolated throughout the genome,
and are well mapped (Weissenbach, et al., 1992). Many of these SSR markers are now
available commercially for linkage studies (e.g., from Research Genetics, Huntsville, AL).
Those markers which frequently allow the investigator to identify each parental chromosome as
unique and to identify each crossover rapidly (see Figure 2) approach the ideal for linkage
studies.

Most SSR are (GT)n dinucleotide repeat length polymorphisms (see Figure 3). It is
estimated that there are about 100,000 of the (GT)n type SSR, or approximately one every 30
kb (Beckmann, et al., 1992). Over 1,000 SSR markers have been described to date in the
Genome Data Base, October 19, 1993, The Johns Hopkins University, Baltimore, Maryland,
and thousands of additional markers are now in development.

It is now well accepted that methods based on the polymerase chain reaction (PCR) and
highly polymorphic simple sequence repeat (SSR) markers (e.g., Figure 3) are the techniques of
choice for genotyping in linkage studies (Weber, et al., 1989; Litt, et al., 1989; Edwards, et al.,
1991, Am. J. Hum. Genet., 49:746-756). PCR-based methods are faster and therefore less costly
than restriction fragment length polymorphism (RFLP) methods; moreover, they do not require
nucleic acid probes, and are more informative in linkage studies. Efforts are underway to
develop automated techniques for genotyping that will further improve the efficiency of linkage
studies utilizing this type of microsatellite marker polymorphism. The advantages of analyzing
multiple polymorphic loci using an automated DNA sequencer were first described by Skolnick
and Wallace in 1988 (Genomics, 2:273-279). Building on techniques reported by Connell, et
al. (1987, Biotechniques, 5:342-348), Ziegle, et al. (1992, Genomics, 14:1026-1031) extended
this approach to incorporate automated DNA sizing technology for genotyping microsatellite loci
using four-color fluorescence-based techniques.

However, the analysis of microsatellite markers still relies on gel electrophoresis, which
has limited sample handling capacity. Furthermore, the gel electrophoresis of DNA fragments
is complicated by problems with gel distortion, such as band shifting, that warrant internal size
standards and band-matching software (Lander, 1991, Am. J. Hum. Genet., 48:819-823).
Crosstalk or interference during analysis between multiple dyes with spectral overlap is another
potential problem when multiple PCR fragments of the same size are to be identified within the
same gel lane. Since the processing of gels and the scoring of autoradiographs remains the
rate-limiting step in genotyping, methods are being sought that improve the efficiency of sample
handling while minimizing errors in data transcription and analysis.

The challenge of mapping the major genes in complex disorders requires efficient and
highly accurate methods of genotyping. Recent technological enhancements in molecular
genetics have significantly improved our ability to locate disease genes by linkage analysis.
However, despite the introduction of molecular methods, such as PCR, and the discovery of
highly polymorphic SSR, genotyping is still rate-limiting for localizing disease genes by linkage.
The present methods remain highly technical, time-consuming, and expensive.
SUMMARY OF THE INVENTION 

It is an object of this invention to provide a robust semi-automated protocol for
genotyping using multiplex analysis of many microsatellite loci while maintaining, or improving,
typing accuracy as compared to traditional methods. It is also an object of this invention
to provide a collection of highly reproducible microsatellite markers at approximately 10-50 cM
intervals throughout the human genome which can be detectably labelled.

It is a further object to provide protocols for the reliable use of these marker systems in
automated genotyping.

To meet these and other objects, and to better exploit the inherent advantages of
fluorescence-based genotyping techniques, this invention provides highly informative SSR
markers, assembled into "SETS" that do not overlap in size when separated electrophoretically
on an acrylamide gel and that can be labelled with different fluorophores. Each SET contains
6 or more pairs of primers (preferably 7-8 pairs) that provide for amplification of markers and
that have been labelled with the same fluorophore having a distinct color, separate
SETS having different fluorophore labels (e.g., blue, green, or yellow). PCR products
corresponding to these SETS are combined into a GROUP for electrophoretic analysis in a single
lane. Using this methodology, a GROUP of 18 or more, preferably 21 to 24, dinucleotide
markers can be electrophoresed along with an internal size standard and analyzed simultaneously
(multiplexing) in real-time for each individual studied.
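
The composition rules for a GROUP stated above (at least three SETS, one dye per SET, at least 6 markers per SET, 18 or more markers per lane) can be checked mechanically. The following is a minimal sketch under stated assumptions: the dictionary layout, the function name `check_group`, and the marker names are hypothetical, and the rule that the red dye (ROX) is kept for the internal size standard follows the description of Figure 4 in this specification.

```python
RESERVED_DYE = "ROX"  # red dye kept for the internal size standard


def check_group(sets_by_dye, min_sets=3, min_per_set=6, min_total=18):
    """Return a list of problems with a proposed GROUP (empty if none).

    sets_by_dye maps a dye name to the marker names of the SET labelled
    with that dye; using a dict guarantees one dye per SET.
    """
    problems = []
    if len(sets_by_dye) < min_sets:
        problems.append(f"only {len(sets_by_dye)} SETS; need at least {min_sets}")
    if RESERVED_DYE in sets_by_dye:
        problems.append(f"{RESERVED_DYE} is reserved for the size standard")
    for dye, markers in sets_by_dye.items():
        if len(markers) < min_per_set:
            problems.append(f"SET {dye} has {len(markers)} markers; need {min_per_set}")
    total = sum(len(markers) for markers in sets_by_dye.values())
    if total < min_total:
        problems.append(f"GROUP has {total} markers; need at least {min_total}")
    return problems


# Hypothetical GROUP: three SETS of seven markers each, one dye per SET.
group = {
    "FAM": [f"blue_{i}" for i in range(7)],
    "JOE": [f"green_{i}" for i in range(7)],
    "TMR": [f"yellow_{i}" for i in range(7)],
}
print(check_group(group))
```

This sketch checks only counts and dye assignment; whether the markers within a SET are adequately separated in size is a separate question.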

In particular, the invention provides a kit for use in automated genotyping within a
population comprising four or more GROUPS, each GROUP containing at least three SETS, and
each SET in turn comprising at least 6 labelled pairs of primers for amplification of DNA by
polymerase chain reaction (PCR), the sequence of each primer pair corresponding to a portion
of the unique genomic sequence of a microsatellite sequence (which is made up of a nucleotide
repeat sequence flanked by unique sequences), the nucleotide repeat sequence being polymorphic
within the population. Amplification of DNA from a human sample by the polymerase chain
reaction (PCR) primed with a particular primer pair amplifies the nucleotide repeat sequence and
at least some of the immediately adjacent unique sequences of the microsatellite sequence to
produce a PCR product identified with the primer pair. The distance in the genome between the
microsatellite sequence amplified by one primer pair of the kit and the nearest other
microsatellite sequence amplified by another primer pair of the kit is at least 2 centimorgans
(cM) and no more than 50 cM. Each SET consists of at least 6 of the primer pairs, where the
length of the segment amplified by a particular primer pair (its PCR product) differs from the
length of PCR products from all other primer pairs in the SET by at least 5 nucleotides for
tetranucleotide repeats, at least 6 nucleotides for trinucleotide repeats and at least 9 nucleotides
for dinucleotide repeats. At least one primer of each primer pair is labelled with a fluorescent
label that is the same for all primer pairs in the SET. Each GROUP consists of at least three
SETS of primer pairs labelled with fluorescent labels, and primers from one SET in the GROUP
are labelled with a fluorescent label which fluoresces at a wavelength which is substantially
different from the wavelength at which the fluorescent labels on the primers in each of the other
SETS in the GROUP fluoresce.
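
The size-separation requirement within a SET can be sketched as a pairwise check on published allele size ranges. This is an illustration only: the `Marker` class, the marker names and the size ranges are hypothetical, the gap is measured between the nearest ends of two allele ranges, and, as an assumption not stated in the text, a pair of markers with different repeat types is held to the larger of the two minimum gaps.

```python
from dataclasses import dataclass
from itertools import combinations

# Minimum separation in nucleotides, keyed by repeat unit length:
# dinucleotide 9, trinucleotide 6, tetranucleotide 5 (values from the text).
MIN_GAP = {2: 9, 3: 6, 4: 5}


@dataclass(frozen=True)
class Marker:
    name: str
    min_size: int      # smallest known allele, in nucleotides
    max_size: int      # largest known allele, in nucleotides
    repeat_unit: int   # 2, 3 or 4


def size_gap(a, b):
    """Nucleotides between the two allele ranges (negative if they overlap)."""
    lo, hi = sorted((a, b), key=lambda m: m.min_size)
    return hi.min_size - lo.max_size


def set_conflicts(markers):
    """Return the name pairs whose allele ranges are too close together."""
    conflicts = []
    for a, b in combinations(markers, 2):
        # Assumption: mixed repeat types use the stricter requirement.
        required = max(MIN_GAP[a.repeat_unit], MIN_GAP[b.repeat_unit])
        if size_gap(a, b) < required:
            conflicts.append((a.name, b.name))
    return conflicts


def is_valid_set(markers, min_markers=6):
    """A SET needs at least six markers and no size conflicts."""
    return len(markers) >= min_markers and not set_conflicts(markers)


# Hypothetical dinucleotide markers with 10-nucleotide gaps between ranges.
good = [Marker(f"M{i + 1}", 80 + 30 * i, 100 + 30 * i, 2) for i in range(6)]
crowded = [good[0], Marker("M2b", 105, 125, 2)] + good[2:]
print(is_valid_set(good), set_conflicts(crowded))
```

Checking every pair rather than only neighbouring ranges keeps the sketch simple; with six to eight markers per SET the cost is negligible.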

Where the primers in a single kit cover the entire genome with markers spaced
approximately 10 cM apart in the genome, the kit will usually contain at least about 10
GROUPS. In another embodiment, a kit is provided for screening of the genome with individual
markers spaced in the genome about 50 cM from the nearest other marker in the kit, and the kit
contains at least 4 GROUPS. The invention also provides kits containing fewer GROUPS with
primers whose PCR products identify microsatellite sequences found in the genome spaced
closely about the locations picked out by screening studies performed using the screening kit.

The invention also provides a method of analyzing genomic DNA for the presence of
polymorphisms comprising: extracting DNA from a human sample; combining, in a polymerase
chain reaction (PCR) vessel, an aliquot of the extracted DNA, at least one primer pair selected
from one of the GROUPS described above, and PCR amplification enzymes; cycling the
temperature of each PCR vessel to produce PCR products that can be identified with the primer
pair whose sequence corresponds to unique sequence in the amplified DNA, using an annealing
temperature at which non-specific annealing is minimized; then combining all PCR products
from all PCR vessels containing primer pairs from a single GROUP into a mixture, and
subsequently separating the mixture of PCR products electrophoretically by size; and detecting
separated PCR products by fluorescence detection at wavelengths corresponding to the
fluorescent wavelength for each of the fluorescent labels in the kit. In a preferred embodiment,
one primer of each primer pair is labelled with a fluorescent label and the other primer in the
pair is labelled with biotin, and a mixture containing all PCR products corresponding to the
primer pairs from a single GROUP is prepared by binding the PCR products to a plurality of
paramagnetic beads carrying on their surface a protein which specifically binds biotin (the beads
being added to each PCR vessel after amplification), separating the magnetic beads from the
PCR reaction medium, then separating the two strands of the amplified DNA segments and
combining the strands labelled with a fluorescent label for all primer pairs from one GROUP
into the mixture.

The invention also provides a method for selecting a SET of PCR primers for use in
automated genotyping comprising selecting at least 6 microsatellite sequences, which contain
dinucleotide, trinucleotide or tetranucleotide repeat sequences that are flanked by unique sequences
in the human genome, and are polymorphic within the population, the microsatellite sequences
being separated from each other by at least 2 centimorgans in the genome, and for each
microsatellite sequence constructing primer pairs having the sequence of the unique sequences
flanking the microsatellite sequences, so that the primer pairs will direct PCR amplification of
DNA segments corresponding to each microsatellite sequence and the length of all polymorphs
of the microsatellite sequence amplified by a particular primer pair is detectably different from
the length of all polymorphs of other microsatellite sequences amplified by other primer pairs
in the SET. The invention also provides a kit for use in automated genotyping comprising at
least 10 GROUPS of at least 3 SETS of PCR primers obtained by this method, and a method
of analyzing genomic DNA for the presence of polymorphisms comprising amplifying DNA
extracted from a human sample using PCR directed by these primer pairs to produce PCR
products labelled with detectable labels that are the same for all PCR products from a single
SET, followed by separating electrophoretically a mixture containing all PCR products amplified
from the DNA sample by any primer pair of said SET and characterizing the detectably labelled
PCR products by length.

The invention also provides a diagnostic method for detection by polymerase chain
reaction of genomic rearrangement (including deletions, additions, crossovers and gene
amplification) of a genomic region containing at least 6 known loci at which genetic
rearrangement is diagnostic for a disease, using a kit comprising at least one SET containing at
least 6 PCR primer pairs, the sequences of each primer pair corresponding to the unique
sequences flanking one of the loci of genomic rearrangement. The primer pairs in the SET are
constructed so that the PCR product amplified by a particular pair of primers corresponds to a
DNA segment surrounding one locus of rearrangement with length that is characteristic of a
specific rearrangement, and the length of the PCR products amplified by a particular pair of
primers differs from the length of all other PCR products amplified by other primers in the SET.
DNA from a sample is amplified in a PCR vessel using the polymerase chain reaction (PCR)
primed with at least one of the primer pairs of the SET by cycling the temperature of the vessels
with an annealing temperature that minimizes non-specific annealing to produce detectably
labelled PCR products, and the PCR products for all primer pairs in the SET are detectably
labelled with the same label. Labelled PCR products are separated electrophoretically by size
from a mixture containing all PCR products amplified from the DNA sample by any primer pair
of the SET, and the separated, detectably labelled PCR products are characterized by length.
In a preferred mode, all primers in the SET have annealing temperatures within a 4°C range, and
amplification for all primers in the SET is carried out simultaneously in the same vessel.

The inventor has created a kit comprising SETS of highly polymorphic fluorescent
primers specific for microsatellite markers that cover the genome at approximately 10 cM
intervals for linkage studies. A fluorescence-based protocol based on these SETS has been
developed for detection of multiple microsatellite markers, and the protocol is accurate as
compared to a conventional radiolabeling method that depends on a known DNA sequence ladder
and conventional autoradiography for detection. It has now been demonstrated that genotyping
by semi-automated fluorescence-based techniques is both highly accurate and efficient. We
routinely type 24 fluorescent markers simultaneously using these techniques in my laboratory.
The combined analysis of 24 dinucleotide markers in a single gel maximizes the use of
automated analysis equipment, such as the Applied Biosystems 373A hardware, by producing
PCR products sufficiently small to run the instrument at least twice daily. The methods
provided herein may improve productivity by more than an order of magnitude and can be easily
adapted to most linkage studies.
BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 shows the genetic map of the chromosomal region surrounding a putative
GENETIC locus. In this example the greater the spacing between markers the more likely
recombination will occur during meiosis.

Figure 2 shows segregation data from a fabricated three generation family affected with
a genetic disorder for the four markers illustrated in Figure 1. Squares indicate males, circles
indicate females. Affected and unaffected family members are indicated by solid and open
symbols, respectively. Crossovers that have occurred during meiosis are indicated by the
arrowheads. Recombination with markers 1 and 4 from chromosome A excludes a localization
for the gene causing this disorder in the region immediately above marker 1 and below marker
4. The region from chromosome A between markers 1 and 4 (including markers 2 and 3)
co-segregates with the abnormal phenotype in all the affected individuals in this family but is not
found in any unaffected individuals. These data confirm a localization for the GENETIC locus
under study to this chromosomal region.

Chromosomal region 4 of chromosome B from affected individual I-1 occurs in both
affected and unaffected offspring in generation II, showing no linkage. The markers used in this
demonstration approach the ideal by providing maximal genetic information for every individual
studied.

Figure 3 illustrates the most common form of simple sequence repeat. In this individual
the marker is heterozygous, or differs in the number of dinucleotides between the maternal and
paternal chromosomes. These PCR products would differ in length by 8 nucleotides, and are
each easily detected using gel electrophoresis. The solid bars indicate surrounding sequence that
is unique (occurs only once in the human genome) and can be used to design PCR primers for
amplifying this simple sequence repeat.

Figure 4 shows a cartoon of GROUP 1 markers. Each simple sequence repeat marker
is identified on the left, and the size range for known alleles is noted on the right. Each
marker covers a region of a chromosome to be examined for linkage with a genetic disorder.
The colored boxes refer to the region on the gel where alleles for each marker may be found.
The markers are chosen to avoid overlap between these regions. For increased efficiency each
SET is labelled with one of three fluorophores: yellow, tetramethyl-6-carboxy-rhodamine
(TMR); blue, 5-carboxy-fluorescein (FAM); and green, 2',7'-dimethoxy-4',5'-dichloro-6-carboxy-fluorescein
(JOE). Red, 6-carboxy-rhodamine (ROX), is reserved for internal size
standards (Applied Biosystems). The products of the PCR amplifications are pooled and
subjected to the electrophoresis together. Marker data are derived from the Genome Data Base
(GDB), The Johns Hopkins University, Baltimore, Maryland.

Figure 5 shows a typical set of electrophoretograms for GROUP 2 using DNA from a
single individual.

Figure 6 shows an electrophoretogram of SET A, GROUP 1 markers from one
individual. The size (nucleotides) of each PCR product is given on the X-axis above the
electrophoretogram.

Figure 7 A-M provides a listing of the markers in 13 GROUPS each containing 16-24
markers divided into three SETS. The first column gives a locus designation for the marker to
identify the entry in the Genbank Data Base which provides the unique sequences surrounding
the markers. The unique sequence information can be used to design primers that will direct
PCR amplification of the marker. After the locus designation, the size range of the published
alleles (in base pairs), the degree of heterozygosity in the population and the chromosomal
location are listed, in that order, for each marker followed by the nucleotide sequences of
preferred primer pairs, along with their annealing temperatures and preferred choice for labelled
primer.

Figure 8 demonstrates the difference in autoradiographic image produced depending on
whether the forward or reverse primer is labelled.

Figure 9 shows an autoradiograph of PCR-amplified DNA using the primers of GROUP
2, SET B. The variation in intensity in products of this SET is typical of this type of marker.

Figure 10 shows the effect of varying the amount of paramagnetic beads in a magnetic
bead-based recovery from PCR.

DETAILED DESCRIPTION OF THE INVENTION

Methods for sequencing DNA, for synthesizing oligodeoxynucleotides of defined
sequence, and for separating nucleic acid segments by molecular weight using, e.g.,
electrophoresis are well known to those skilled in the art and well described in the literature, in,
for example, "Molecular Cloning: A Laboratory Manual," Sambrook, et al., eds., Cold Spring
Harbor Laboratory Press, 1989. General methods of analyzing DNA by the polymerase chain
reaction (PCR) including isolation and preparation of DNA templates, synthesis and labelling
of primers, amplification, and analysis of PCR products are also well known and described in
the literature, for example in Sambrook, et al., 1989, or in "PCR Protocols: A Guide to
Methods and Applications," Innis, et al., eds., Academic Press, 1990. The skilled worker in
this art is familiar with these and other methods of manipulating and analyzing DNA, and
routine application of such methods within the skill of the ordinary skilled worker is assumed
in the following description.
Semi-Automated Genotyping:

Despite the improvements in linkage techniques introduced by PCR and SSRs, genotyping
remains highly technical, time consuming, and expensive. The application of fluorescence-based
technology is one way to further reduce the cost and increase the efficiency of this type of
project. Fluorescent labeling of PCR-based markers provides many potential advantages over
radio-labels (e.g., 32P) and other labels in common use for PCR markers. Fluorescent labels are
nontoxic, stable, and can be combined and analyzed together in a single electrophoretic lane 
(multiplexing) to provide a many-fold increase in efficiency over standard methods of detection. 
Fluorescence signals are linear over a much greater range of intensity than conventional 
autoradiography and other methods of detection in use, providing a better means of
distinguishing between alleles and artifact. Band intensity provides an objective method for
distinguishing between alleles and artifacts and may also provide a better means for identifying
the products of microsatellite markers that frequently vary significantly in intensity.

WO 95/15400, PCT/US94/13945

Ultimately, real-time fluorescence detection methods may provide a substantial increase
in efficiency over standard methods of detection based on radiolabeling. A much larger range
of product sizes can be resolved on each gel run as compared to radiolabeling techniques because,
with automated, real-time equipment such as that of Applied Biosystems Inc., the PCR products
pass by the detector toward the bottom of the gel, where the band resolution is greatest.
Efficiency is further improved by the potential real-time semi-automated detection of alleles.
In addition, internal size standards are easily incorporated for reproducibility and the accurate
sizing of alleles, avoiding day to day variability. Computerized data acquisition and handling
further aid productivity and reduce errors in data entry and manipulation. Ultimately,
automation is likely to occur more rapidly with fluorescence-based techniques than with other
methods of labeling and detection.

As an initial test of the fluorescence technology, a study was conducted comparing the
accuracy and reliability of these methods with end-labeling (see Example 1). Three markers
were chosen because they produce PCR products of the same size range. Products of PCR
reactions run with primers complementary to the unique sequences on either side of the SSR for
these markers were obtained using primer pairs in which one primer of each pair was conjugated
to a fluorescent label. These PCR products were electrophoresed simultaneously in a single
electrophoretic lane to test if these genotypes could be accurately determined. Similar to the
report by Ziegle, et al., 1992, there was no difficulty in discerning PCR fragments of the same
size labelled with different fluorophores.

Determining the size of DNA fragments accurately is critical to genotyping in a number
of applications. When parental alleles are available, a simple comparison can determine which,
if either, parental allele has been passed on to a child. However, frequently in linkage studies
the parental alleles are not available for comparison, and paternity must be questioned. This is
also true in DNA forensics, where an unknown must be compared with many others and its size
determined unambiguously. The analysis of PCR products that differ grossly in concentration
is complicated by bandshifting and other gel related artifacts. The accuracy of this typing
procedure must be based on empiric studies of reproducibility using "known" samples as
standards. Non-polymorphic internal size standards can be used to remedy these problems
(Lander, 1991).

Example 1 demonstrates the accuracy of sizing microsatellite PCR products using a 
fluorescence-based approach as compared to a conventional radiation-based method using a 
known sequence ladder. DNA templates may be obtained from the collection of Centre d'Etude 
du Polymorphisme Humaine, Paris (CEPH) for use as a standard set of alleles to compare these 
techniques, because there is little question of the genetic identity of each of the individuals in 
this collection. To avoid ambiguity in genotyping with the fluorescent method, fractional size 
estimates should preferably be accurate to within 0.5 nucleotides. Variation greater than this 
could lead to confusion during band matching, after rounding up or down for size estimates 
provided as a fraction of a nucleotide. Since our analysis suggests that the maximum variation 
is likely to be less than 0.5 nucleotides (and generally significantly less), the method will be 
useful in the intended applications. 
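
The 0.5-nucleotide criterion can be illustrated with a minimal sketch. It assumes that 
replicate fractional size estimates for one PCR product are averaged, rounded to a whole 
nucleotide, and accepted only if every replicate lies within the tolerance. The function name 
consistent_call and the sample values are hypothetical and are not taken from Example 1. 

```python
def consistent_call(replicate_sizes, tolerance=0.5):
    """Return the integer allele size shared by all replicates, or None.

    replicate_sizes: fractional size estimates (nucleotides) for one PCR
    product, e.g. from several gels. tolerance is the 0.5-nucleotide limit.
    """
    call = round(sum(replicate_sizes) / len(replicate_sizes))
    if all(abs(size - call) < tolerance for size in replicate_sizes):
        return call
    return None


# Hypothetical replicate estimates for one allele run on three gels.
print(consistent_call([178.2, 177.9, 178.1]))  # 178
print(consistent_call([178.2, 178.7, 178.1]))  # None: 178.7 is 0.7 nt from 178
```

A result of None marks a sample whose size estimates vary too much for unambiguous band 
matching and which would ordinarily be repeated. 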

As shown in Example 1, no sizing errors occurred with the use of the multi-color 
fluorescence-based technique, showing that this methodology is highly accurate and reproducible 
for scoring microsatellite markers. Since the only sizing error resulted from the use of the 
conventional radiolabeling technique, the fluorescence-based protocol appears at least as accurate 
as the conventional method. Therefore, this approach appears to adequately compensate for gel 
distortion and dye related artifacts as compared to radiation labeling techniques. 

Accordingly, the advantages demonstrated for fluorescence-based techniques may be 
exploited by the method of this invention, which uses at least 6 highly informative SSR markers 
assembled into a ladder which we have designated a "SET". Each SSR marker is characterized 
by PCR primer pairs which have the same sequence as a portion of the unique DNA sequence 
on the 5' side of the sense and antisense strands, respectively, encoding the repeat sequence at 
a particular point in the genome. When the genetic material of a particular individual is 
amplified by PCR using one of these primer pairs, a segment of DNA corresponding to the 
sequence of the particular SSR and its unique flanking sequences is produced (the PCR product). 
The size of the PCR product is dependent both on how much of the unique sequences are 
covered by the primers in the pair and on the number of times the repeat sequence is repeated. 
The number of repeats of the simple sequence at a particular locus varies between individuals 
(polymorphism), and this polymorphism results in PCR products of varying size for different 
individuals. Thus the size of the PCR product can be used to determine if two individuals have 
an allele in common at the genetic locus of the SSR marker. 
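
The dependence of product size on flanking sequence and repeat number can be sketched as 
follows. This is an illustration only: the flank lengths, repeat counts, and the function names 
pcr_product_size and share_allele are hypothetical, and each flank length is assumed to 
include the primer. 

```python
def pcr_product_size(left_flank, right_flank, repeat_unit, n_repeats):
    """Size in nucleotides of the product: both flanks (primers included)
    plus the repeat tract."""
    return left_flank + len(repeat_unit) * n_repeats + right_flank


def share_allele(genotype_a, genotype_b):
    """True if two genotypes (pairs of allele sizes) have a size in common."""
    return bool(set(genotype_a) & set(genotype_b))


# Hypothetical (CA)n marker with 60 and 70 nucleotides of flanking sequence.
short_allele = pcr_product_size(60, 70, "CA", 18)   # 166
long_allele = pcr_product_size(60, 70, "CA", 22)    # 174
print(share_allele((short_allele, long_allele), (174, 180)))  # True
```

Lengthening or shortening either flank by moving a primer shifts every allele of the marker by 
the same amount, which is the property used to position markers within a SET. 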

The spacing in the gel between PCR products identified with different markers is critical. 
By carefully selecting the length of the primer sequences for each marker, the PCR products 
corresponding to each marker in a SET are spaced a critical distance from surrounding markers 
such that none of the PCR products for the largest known alleles of one marker overlap in size 
with PCR products for the shortest known alleles of another marker in the SET when separated 
on a 6% denaturing acrylamide gel. An additional safety margin should be provided, because 
rare undocumented alleles (larger or smaller) may occur for any given marker. Size spacing of 
less than 9 nucleotides between dinucleotide SSR markers increases the likelihood for overlap 
because 2-4 stuttering bands (each 2 nucleotides apart) below the smallest allele of one marker 
may overlap with the largest allele of the marker below it. PCR products for trinucleotide 
repeat sequences and tetranucleotide repeat sequences are not observed to exhibit stuttering 
bands, so the minimum separation distance above and below the largest and smallest known 
alleles can be less for tri- and tetranucleotide repeats. Usually, PCR products for trinucleotide 
repeats in a SET will differ by at least 5 base pairs, and for tetranucleotide markers by at least 
6 base pairs. Preferably a SET will contain 7-9 SSR markers, most preferably 8-9 markers. 
The upper limit on the number of markers in a SET is dependent on the length of the 
electrophoretic separation. 
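
The spacing rule lends itself to a simple check, sketched below under the assumption that each 
marker is described by the sizes of its smallest and largest known alleles. The Marker type, 
the function spacing_conflicts, and the marker names and sizes are hypothetical. The default 
gap of 9 nucleotides is the figure given above for dinucleotide markers. 

```python
from typing import NamedTuple


class Marker(NamedTuple):
    name: str
    smallest: int  # smallest known allele, in nucleotides
    largest: int   # largest known allele, in nucleotides


def spacing_conflicts(markers, min_gap=9):
    """Return (lower, upper, gap) for adjacent markers spaced below min_gap."""
    ordered = sorted(markers, key=lambda m: m.smallest)
    conflicts = []
    for lower, upper in zip(ordered, ordered[1:]):
        gap = upper.smallest - lower.largest
        if gap < min_gap:
            conflicts.append((lower.name, upper.name, gap))
    return conflicts


# Hypothetical SET of three dinucleotide markers.
candidate_set = [
    Marker("M1", 100, 118),
    Marker("M2", 128, 146),
    Marker("M3", 150, 170),
]
print(spacing_conflicts(candidate_set))  # [('M2', 'M3', 4)]
```

A reported conflict indicates that one of the two markers needs redesigned primers, or a 
different position in the SET, before the SET is used. 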

The PCR product of each primer pair in the SET is tagged with the same label, 
preferably a fluorescent dye. Usually a fluorescent label is covalently attached to one of the 
primers in a primer pair. Alternatively, the PCR product may be uniformly labelled by adding 
one or more fluorescently-labelled nucleoside triphosphates to the PCR reaction. Labelling of 
the primers may be accomplished by including a fluorescently-labelled nucleotide during 
synthesis of the primer or by linking a fluorescent label to the primer after synthesis. 
Fluorophore labels for attachment to nucleic acids, including PCR primers, are readily available 
in the art. (See, e.g., Nagaoka, et al., (1992) Chem. Pharm. Bull., 40:2559-2561; Giusti, et 
al., (1993) PCR Methods Appl., 2:223-227; Alexandrova, et al., Nucleic Acids Symp. Ser. 1991, 
p. 277; Schubert, et al., (1992) DNA Seq., 2:273-279; Vu, et al., (1990) Tetrahedron Lett., 
31:7269-7272.) Usually the labels contain coupling groups that react with modified nucleotides 
of the PCR primers to form covalent links. Attaching such fluorophores to the primers in the 
SETS of this invention is easily within the skill of the ordinary worker. See, e.g., Levenson 
and Chang, 1990, "Nonisotopically Labelled Probes and Primers," in PCR Protocols, Innis, et 
al., eds., Academic Press, NY. Fluorescent labels with non-overlapping emission spectra are 
also available commercially, for example, from Applied Biosystems, Inc., including 
5-carboxy-fluorescein (FAM-blue), 2',7'-dimethoxy-4',5'-dichloro-6-carboxy-fluorescein 
(JOE-green), N,N,N',N'-tetramethyl-6-carboxy-rhodamine (TMR-yellow), and 
6-carboxy-X-rhodamine (ROX-red); from Biological Detection Systems, Inc., Pittsburgh, PA 
(BDS) including nucleoside triphosphates coupled to cyanine dyes that fluoresce in the green or 
orange region, or Boehringer Mannheim Corporation Biochemical Products, Indianapolis, IN, 
including fluorescein-5(6)-carboxamidocaproxyl-dUTP (yellow), 
7-hydroxy-coumarin-3-carboxyl-dUTP (blue), and 
tetramethylrhodamine-5(6)-amino-thiono-dUTP (red). 

Additional suggestions for selecting labels with non-overlapping fluorescent spectra and 
derivatizing oligonucleotides with them can be found in Smith, et al. 1986, Nature, 321:674- 
679, incorporated herein by reference. Alternatively, primers (or PCR products) may be 
labelled with biotin (see, e.g., Innis, et al., "PCR Protocols," Academic Press, NY, 1990, pp. 
100-103) and then streptavidin coupled to a particular fluorescent dye added to all of the PCR 
products of a particular SET. Variations of these labelling methods or similar methods known 
to those skilled in the art may be used, so long as all PCR products for markers in one SET are 
labelled with the same label. 

SETS, each labelled with a different fluorophore, can be pooled into a collection of 
markers that we have termed a "GROUP." The number of SETS in a GROUP will depend on 
the availability of distinct labels. PCR products for each SET in the GROUP will usually be 
labelled with fluorophores that emit light at a wavelength substantially different from the 
wavelengths emitted by fluorophore labels of the other SETS in the GROUP, where 
"substantially different" means sufficiently distinct to be distinguished by the detection means 
chosen for detecting PCR products after electrophoresis. For example, three commercially 
available fluorophores, referred to as TMR, FAM, and JOE (Applied Biosystems), have 
different colors which are yellow, blue, and green, respectively. 

Using this approach we have analyzed as many as 24 SSR markers in a single 
electrophoretic lane using three distinct fluorescent labels to label three SETS in the GROUP 
(see, e.g., Fig. 4). In a preferred mode, these fluorescent PCR products may be separated on an 
automated electrophoresis system, such as the Applied Biosystems 373 sequencer with internal 
size standards in each lane (labelled, for example, with ROX (red dye), Applied Biosystems) and 
analyzed using, e.g., GeneScan 672 software (Applied Biosystems) (Ziegle, et al., 1991, Miami 
Short Rep., 1:70) and scored using GENOTYPER software (Applied Biosystems), with data 
displayed as an electrophoretogram or in a spread sheet format. Gel band fluorescent intensities 
and peak areas provide an objective method of distinguishing alleles from artifact (stuttering 
bands). A typical electrophoretogram from a single individual for SET A GROUP 1 is 
illustrated in Figure 6. 
Marker Selection and Development: 

The human genome is estimated to be approximately 3000 cM in length. Therefore, to 
adequately "cover" the entire genome at 10 cM intervals will require approximately 300 highly 
informative well spaced markers. An alternative estimate obtained by summing the meiotic 
maps from all the chromosomes suggests that the genome is approximately 5000 cM in length 
(NIH/CEPH Collaborative Mapping Group, 1992, Science, 258:67-86). Adequate "coverage" 
of the entire genome based on this size estimate at 15 cM intervals (which would allow testing 
for linkage without using a prohibitively large number of families) will require about 333 highly 
informative well spaced markers. 
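
The two marker counts follow from dividing the genetic length by the desired interval, as the 
short calculation below shows. The function name approximate_marker_count is hypothetical, 
and the result is rounded to the nearest whole marker to match the approximate figures in the 
text. 

```python
def approximate_marker_count(genome_length_cm, interval_cm):
    """Approximate number of evenly spaced markers needed for coverage."""
    return round(genome_length_cm / interval_cm)


print(approximate_marker_count(3000, 10))  # 300
print(approximate_marker_count(5000, 15))  # 333
```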

Characteristics of preferred markers can be summarized as follows: unique sequence 
surrounding the marker is available for use in designing primers, they have been sized 
accurately, the heterozygosity value is known, and each marker has been carefully localized. 
Over 1000 SSR markers, including the surrounding unique sequence and chromosomal location, 
have been described to date in the Genome Data Base (GDB), October 19, 1993, The Johns 
Hopkins University, Baltimore, Maryland. In contrast to older approaches, such as RFLP, many 
of the preferred SSR markers are heterozygous (alleles differ at a particular locus) > 50% of the 
time and therefore are highly informative for linkage studies. Each allele of the markers used 
in the method of this invention will be easily detectable after amplification by PCR as a 
predictable component of a complex image or signature by 5' end labeling with ³²P, labeling 
with fluorescence, or by a variety of other methods. Most preferably, the markers also produce 
an easily scored product or simple pattern of stutter bands that are the signature of 
mononucleotide and dinucleotide repeats. 

Most dinucleotide repeats produce two or three smaller less intense products or "stutter 
bands" (Weber, 1989). These are artifacts produced during PCR, and are less common in PCR 
of tri- and tetranucleotide repeats. Although these stutter bands have been generally considered 
undesirable, they can be quite helpful to the investigator (or computer) during the scoring of 
genotypes by allowing for the identification of 'false' bands (background bands due to 
non-specific annealing). Each allele can then be easily scored by 5' end labeling with ³²P or 
fluorescence after amplification by PCR, as a predictable component of a complex image. 
Background bands are generally not associated with stuttering artifacts. Because artifacts due 
to nonspecific annealing are difficult to eliminate entirely from a PCR reaction, the adaptation 
of a similar protocol for the multiplex semi-automated genotyping of tri- and tetranucleotide 
repeats may be more problematic. The method of this invention reduces artifacts due to 
non-specific annealing by control of the annealing temperature for respective primers during 
temperature cycling. 

The use of dinucleotide SSR is preferred in the method of this invention, because the 
potential advantages for automated genotyping may not be so easily incorporated into practice 
for mono-, tri- and tetranucleotide repeats. PCR products of trinucleotide and tetranucleotide 
repeats lack the unique "stuttering" signature of dinucleotide repeats, making it difficult for the 
computer to distinguish real alleles from artifacts produced by nonspecific annealing during 
PCR. Although a simple set of PCR products are produced as alleles (little or no stuttering) 
from tri- or tetranucleotide SSRs, it is often difficult to eliminate other PCR artifacts completely. 
These PCR artifacts are not easily distinguished from "false" bands when large numbers of PCR 
products that vary significantly in intensity are combined as described by this method. The 
unique signature derived from the stuttering bands of dinucleotide repeats provides a simple 
means of distinguishing real products (alleles) from artifactual bands. 
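
One way a computer might apply the stutter signature is sketched below. It is an illustration 
under stated assumptions, not the scoring algorithm of GeneScan or GENOTYPER: peaks are 
given as (size, intensity) pairs, and a peak is taken to carry the signature when a weaker 
peak lies one repeat unit (2 nucleotides) below it. The function name and peak values are 
hypothetical. 

```python
def has_stutter_signature(peak, peaks, repeat_len=2, tolerance=0.5):
    """True if a weaker peak lies one repeat unit below `peak`.

    peak and each element of peaks are (size_in_nucleotides, intensity).
    """
    size, intensity = peak
    return any(
        abs((size - repeat_len) - other_size) <= tolerance
        and other_intensity < intensity
        for other_size, other_intensity in peaks
    )


# Hypothetical lane: one allele at 180.1 with two stutter bands, plus a
# background band at 191.3 that has no stutter below it.
lane = [(180.1, 1000), (178.0, 400), (176.1, 150), (191.3, 300)]
print(has_stutter_signature((180.1, 1000), lane))  # True
print(has_stutter_signature((191.3, 300), lane))   # False
```

A stutter band can itself satisfy this test when a still weaker band lies below it, so a 
practical scoring routine would also compare intensities to pick the true allele from each 
ladder of bands. 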

Furthermore, the cost of the hardware is generally considered the limiting factor when 
adopting the fluorescent approach. Tri- and tetranucleotide markers generally require a 
significantly larger fraction of each gel because alleles span a much larger size range. Thus 
longer run time is required, and fewer markers can be resolved per gel. The cost of the 
hardware becomes readily affordable if one considers the utility and throughput of such an 
instrument when used according to the method of this invention. However, the use of fewer 
markers per lane (i.e., tetranucleotide repeats) would substantially reduce the cost effectiveness 
of the hardware by reducing efficiency. 

Finally, far fewer tri- and tetranucleotide markers have been fully characterized at 
present. Thus, the availability of well-characterized primers which can be assembled into SETS 
and GROUPS remains another limiting factor at present. 
Construction of Marker SETS: 

The selection of markers for inclusion in each SET is based on the need to: maximize 
heterozygosity values (genetic informativeness), place the marker within a SET based on the size 
of the PCR products (alleles produced must not overlap with those of the marker above or below 
it), and the location of the marker in the genetic map (ideally we would have 450-500 markers 
placed 10 cM or less apart). The PCR products corresponding to markers within a SET are 
sized to assure that infrequent alleles and stutter bands do not produce overlap between the 
markers (compare, e.g., Figures 4 and 6). PCR products for SETS of dinucleotide markers 
differ by approximately 9 nucleotides, preferably, at least 10 nucleotides, in length. When 
necessary, new oligonucleotide primers based on the unique sequence surrounding a polymorphic 
marker are designed and synthesized to assure that the PCR products do not overlap during 
electrophoresis. 

Figures 7A-M show 289 SSR markers that have been selected and combined into 11 
GROUPS of 21-24 markers and 2 incomplete GROUPS of 16 markers so that markers in each 
GROUP can be separated and analyzed simultaneously. The selected markers cover the genetic 
map on average once every 10 cM. Most are heterozygous greater than 70% of the time. In 
a preferred embodiment, each SET is composed of 8 markers from multiple linkage groups (see, 
e.g., Figure 7B-H). Most preferably, SETS of markers are part of a single linkage group (i.e. 
a single chromosome), but this may require significant additional labor because fewer existing 
primers will be suitable. 

Additional or alternative SSR loci to assemble into GROUPS of markers may be found 
in GDB. Loci listed in GDB can be arranged on the genetic map by using map location 
information in GDB. Additional or alternative primers may then be designed using information 
on the surrounding DNA sequence available in Genbank, based on the locus designations from 
GDB. GROUP 1 markers (Figure 7A) are currently performing well in multiple laboratories. 

In many cases new oligonucleotide primers must be designed from the sequence 
surrounding each marker to produce PCR products that fit between the products of the markers 
above and below it without overlap. The new primers can readily be designed from the known 
sequence surrounding the SSR. Criteria for selecting a sequence to be synthesized as a PCR 
primer are well known (see, e.g., Sambrook, et al., and Innis, et al., especially p. 9). 
Preferably, the unique primer 3' sequence should contain at least 7 nucleotides, the ΔG 
threshold should be at least -1.0 kcal/mol, most preferably -1.4 kcal/mol, and duplex formation 
should be avoided, the maximum length of duplex not exceeding 2 base pairs. The sequence 
of preferred primers will also minimize or eliminate self-complementarity, hairpin formation, 
and false priming. Once the sequences of candidate primers are chosen, synthesis is readily 
accomplished by standard methods (see, e.g., Sambrook, et al.). 
Optimization of PCR Conditions and Appearance on the Gel: 

These new primers must be tested to assure that they produce an easily scored collection 
of products of the correct size. Scoring may be easier if the label is on one primer rather than 
the other for particular markers (see, e.g., Figure 8). Primers developed for dinucleotide 
markers may perform well in the PCR reaction, but produce products unacceptable for 
genotyping (single base stuttering bands, stuttering bands of equal intensity with true alleles, or 
stuttering bands that are larger than the correct allele), and such primers should be avoided. 

For best results, the PCR conditions for each marker should be optimized to eliminate 
any artifactual PCR products due to nonspecific annealing that may complicate the analysis of 
a GROUP of combined markers. In particular, the temperature of the annealing phase of each 
PCR cycle should be optimized for each primer pair. Accordingly, the annealing phase 
temperature is set relatively high, so that specific hybridization occurs, but non-specific 
hybridization between the template DNA and the primers is minimized. Usually, the selectivity 
provided by this optimization is preserved in the method of this invention by limiting the number 
of primer pairs in any PCR reaction vessel to those whose optimized annealing temperature is 
the same or nearly the same. Preferably, all primer pairs in the same PCR vessel have 
annealing temperatures within 4°C of each other. At one extreme, an entire 96 well plate is 
dedicated to PCR reactions using primers for a single marker. (When genotyping is performed 
for a large number of individuals, using a separate plate for PCR reactions for each marker will 
not reduce efficiency.) Alternatively, each PCR vessel on a plate has only one primer pair, but 
the plate contains vessels having different primer pairs, so long as all primer pairs on the same 
plate have annealing temperatures within 4°C. In a preferred mode, all of the primer pairs for 
a SET or even a GROUP are constructed to have optimized annealing temperatures in a narrow 
range, most preferably 4°C, and all of the primers are present in a single PCR reaction vessel, 
obviating the need to mix the individual PCR products prior to electrophoretic separation. 
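
Assigning primer pairs to plates or vessels by annealing temperature can be sketched as a 
simple greedy grouping. This is one possible approach, not a procedure stated in the text: 
primer pairs are sorted by optimized annealing temperature, and a new batch is started 
whenever a pair lies more than 4°C above the coolest pair in the current batch. The function 
name, marker names, and temperatures are hypothetical. 

```python
def group_by_annealing_temp(primer_temps, max_spread=4.0):
    """Split primer pairs into batches whose annealing temperatures all lie
    within max_spread degrees C of each other.

    primer_temps: dict mapping marker name -> optimized annealing temp (C).
    """
    batches = []
    for name, temp in sorted(primer_temps.items(), key=lambda item: item[1]):
        # batches[-1][0][1] is the lowest temperature in the current batch.
        if batches and temp - batches[-1][0][1] <= max_spread:
            batches[-1].append((name, temp))
        else:
            batches.append([(name, temp)])
    return [[name for name, _ in batch] for batch in batches]


# Hypothetical optimized annealing temperatures for four markers.
temps = {"marker_A": 60, "marker_B": 55, "marker_C": 57, "marker_D": 63}
print(group_by_annealing_temp(temps))
# [['marker_B', 'marker_C'], ['marker_A', 'marker_D']]
```

Each batch could share one thermal cycling program on one plate, or, in the preferred mode, 
one reaction vessel. 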

In addition, each marker should be evaluated to assure it is sized correctly within the SET 
and that the alleles can be easily scored as distinct products. Furthermore, reported 
heterozygosity values are usually verified using a population of unrelated individuals. The same 
DNA templates provided herein may be used as controls for verification of protocols and quality 
assurance. Preferred controls include CEPH parents (BIOS corporation, New Haven, Conn.; 
Cell Repository, Camden, NJ.), such as families 1331, 1347, 884, for which reference alleles 
are known (see, Weber, et al., and Genethon Microsatellite Map Catalog, Genethon Human 
Genome Research Center, Evry, France). Pooled DNA from volunteers who have donated 
blood that has been purified as described in the EXAMPLES may be used as well. 
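
Verification of a reported heterozygosity value can be illustrated with the standard estimate 
of expected heterozygosity, one minus the sum of the squared allele frequencies. The text does 
not state which estimator was used, so this choice is an assumption, and the function name and 
allele sizes below are hypothetical. 

```python
from collections import Counter


def expected_heterozygosity(allele_observations):
    """Expected heterozygosity, 1 - sum(p_i ** 2), from observed alleles.

    allele_observations: one allele size per chromosome sampled.
    """
    counts = Counter(allele_observations)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical sample of 20 chromosomes with four equally frequent alleles.
sample = [176, 178, 180, 182] * 5
print(expected_heterozygosity(sample))  # 0.75
```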

This optimization process requires the synthesis of oligonucleotide primers, dilution and 
aliquoting of primers, identification of the appropriate annealing temperature and PCR 
protocol, electrophoresis of the products, autoradiography and data analysis. If labelled primers 
are used for detection of products, 5' end labeling of both primers should be tested to determine 
which one produces the best image*. The size of the PCR products from each marker should 
be verified experimentally to assure that it does not overlap with the products of the surrounding 
markers in the same SET. As a control for this purpose, PCR products from a pool of DNA 
samples from a population of unrelated individuals may be electrophoresed against a DNA 
sequence ladder. In a preferred mode the test pool will contain at least 50 chromosomes. 

Initial characterization of primers for each SSR marker may be performed with ³²P labels 
because this is less costly, but the smooth adaptation of fluorescent-based techniques for 
genotyping with markers that have been optimized using ³²P is also dependent on assuring that 
the PCR products labelled with a fluorescent dye perform as expected during PCR and analysis. 
Therefore, the reliability of the developed protocol should be checked by electrophoresis of 
DNA samples labelled by PCR with the fluorescent labels. 



* Frequently the image produced by labeling one of the pair of primers is blurred; see, 
e.g., Figure 8. 

The PCR products of different microsatellite markers frequently vary significantly in 
intensity (see, e.g., Figure 9). The sizing of fluorescent PCR products of grossly different 
concentrations is potentially complicated by sample overloading, causing spectral interference 
between the dye labels during analysis. There was no interference in the detection of the 
overlapping products using the four dyes in Examples 1 or 5, because the concentration of each 
PCR product was determined and adjusted to prevent overloading. However, in our experience 
this can become a problem when working routinely with 21 to 24 pooled markers. 

Overloading can lead to artifacts that become especially troublesome when they are 
interpreted as internal size standards. To prevent the inaccurate sizing of the products by the 
GeneScan 672 software, we have found that the selection of the standard peaks must be carried 
out manually. During large scale applications, such as in our linkage studies, this may become 
a serious problem. Moreover, it is often impractical to estimate the concentration of each of the 
fluorescent products in order to adjust the concentration of the individual samples to be pooled. 
Generally adjustments in the volumes for each marker can be made for all the samples by 
estimating the relative intensity of the marker within a SET. This is easily accomplished by 
referring to the data table of fluorescent band intensities or by viewing the electrophoretogram 
directly. 
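
A volume adjustment of this kind might be computed as sketched below. The scaling rule is an 
assumption made for illustration and is not specified in the text: the weakest marker in the 
SET keeps a base volume, and each stronger marker is pooled at a volume reduced in 
proportion to its relative intensity. The function name, marker names, intensities, and the 
base volume are hypothetical. 

```python
def adjusted_volumes(intensities, base_volume_ul=5.0):
    """Volume of each marker's PCR product to pool, in microliters.

    intensities: dict mapping marker name -> relative band intensity (> 0).
    The weakest marker is pooled at base_volume_ul; brighter markers are
    scaled down so that all contribute roughly equal signal.
    """
    if any(value <= 0 for value in intensities.values()):
        raise ValueError("intensities must be positive")
    reference = min(intensities.values())
    return {
        marker: round(base_volume_ul * reference / value, 2)
        for marker, value in intensities.items()
    }


# Hypothetical relative intensities read from a table of band intensities.
print(adjusted_volumes({"marker_A": 1000, "marker_B": 4000, "marker_C": 2000}))
# {'marker_A': 5.0, 'marker_B': 1.25, 'marker_C': 2.5}
```

Because the same volumes are applied to every sample on a plate, the adjustment is compatible 
with transfer by multichannel pipette. 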

In a preferred mode, PCR products are recovered and combined into a mixture containing 
the GROUP by a simple protocol that uses magnetic separation technology to purify the 
fluorescent PCR products and which restricts the total amount of product pooled to prevent 
overloading. Magnetic separation provides simple separations based on specific binding 
interactions without the need for expensive centrifuges. Saturation binding to a limited amount 
of paramagnetic beads can be used to control the amount of labelled PCR product carried 
forward in the analysis. Relative intensity may be adjusted by this means and overloading may 
be avoided. 

In a preferred embodiment, one primer is labelled with a component that will bind to 
magnetic microbeads, for example biotin-labelled primers will bind to streptavidin-coated 
magnetic beads. Methods for labelling primers with biotin are taught in, e.g., Innis, et al., 
"PCR Protocols," 1990, pp. 100-103 and references cited therein. Magnetic beads coated with 
streptavidin are commercially available (Dynabeads™) and procedures for separation are 
described in, e.g., "Magnetic Separation Techniques Applied to Cellular and Molecular 
Biology," Kemshead, et al., eds., Wordsmiths' Conference Publications, Somerset, U.K., 1991. 
A fixed amount of magnetic beads is added to the PCR reaction after amplification using 
primers that will bind to the magnetic beads. The magnetic beads with the PCR product 
attached are separated from the remainder of the PCR reaction mixture, including salts and 
unused, detectably-labelled primer, and then the PCR product is recovered from the magnetic 
beads (for example, by separating the strands, leaving one strand attached to the bead and 
recovering the other strand whose primer carries the detectable label). 

Alternatively, the entire PCR product may be labelled by including biotinylated UTP in 
the PCR reaction medium as described by Dennis, et al., 1990, in "PCR Protocols," Innis, et 
al., eds. The PCR product can be bound to the beads for purification from the PCR reaction 
mix and excess primer, and subsequently recovered from the beads by, for example, denaturation 
of streptavidin. In another alternative mode, paramagnetic beads which have attached to their 
surfaces single stranded DNA corresponding to a part of the sequence of the PCR product may 
be added to the PCR reaction mix at the end of amplification, followed by cycling above the 
melting temperature, reannealing and then separating the paramagnetic beads and any other DNA 
strands annealed to the beads from the reaction mix. Labelled strands can then be recovered 
from the beads, as above. 

Selection of SETS and GROUPS of fluorescent SSR markers covering the human genome 
(approximately 300) can be completed in approximately 6-9 months, using the procedures 
provided herein. Preferably, additional fluorescent markers will be developed (approximately 
500 SSR markers) providing a higher resolution tool for gene mapping. The resolution of this 
marker collection will approach 10 cM and will preferably cover the telomeres which will better 
assure linkage detection in complex non-Mendelian disorders like asthma and diabetes. 

The development of a common index set of fluorescent markers that can be used in 
multiple laboratories simultaneously should provide certain advantages in genomic studies. 
Typing these common index loci in a number of different populations afflicted with the same 
disorder will facilitate the comparison of linkage results and provide the information required 
for the eventual application of these techniques to forensic medicine. 

The method of this invention offers several significant advantages over a similar strategy 
adopted by Diehl et al., 1991, Am. J. Hum. Genet., 47:177. Spacing markers in a SET 
according to this invention avoids overlap, providing improved discrimination among markers 
and between markers and artifacts. As many as eight or more markers may be incorporated into 
a SET. When necessary, new oligonucleotide primers based on the unique sequence surrounding 
a polymorphic marker can be designed and synthesized as taught herein to assure that the PCR 
products do not overlap during electrophoresis. Errors introduced by sample handling may also 
be minimized by storing DNA from each individual to be studied in a 96-well format. Our 
protocol preserves the integrity of a 96 well format including PCR amplifications, product 
pooling, and sample purification, thereby minimizing sample handling and errors introduced by 
excessive sample manipulations. In a preferred mode, efficiency is further aided by the transfer 
of a row of samples by multichannel pipette. 

The combined analysis of multiple markers maximizes the use of the Applied Biosystems 
373 sequencer or similar automated analysis hardware. Since the capacity of the 373 sequencer 
is 36 lanes per gel, 864 genotypes (1728 alleles) can be analyzed routinely from one gel using 
the semi-automated method of this invention. A typical linkage study would include about 100 
families or about 500 individuals. For a 5-year study including about 300 markers, 
approximately 180 gels, or about 3 gels per month, will be required. By using the method of 
this invention, at least 2 gels per day can be run per 373 sequencer. Thus, up to 12 
investigators can be accommodated on one instrument, which substantially reduces the cost per 
investigator. 
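
The throughput figures can be reproduced with a short calculation, given here as a sketch. It 
assumes one individual per lane, 24 markers per lane, and no lanes set aside for controls, 
which is why it yields slightly fewer gels than the approximate figure quoted above. The 
constant and function names are hypothetical. 

```python
import math

LANES_PER_GEL = 36       # capacity of the 373 sequencer, from the text
MARKERS_PER_LANE = 24    # one GROUP of three SETS of 8 markers

genotypes_per_gel = LANES_PER_GEL * MARKERS_PER_LANE
alleles_per_gel = 2 * genotypes_per_gel


def gels_required(individuals, markers):
    """Gels needed to type every individual at every marker."""
    return math.ceil(individuals * markers / genotypes_per_gel)


study_gels = gels_required(individuals=500, markers=300)
print(genotypes_per_gel, alleles_per_gel)   # 864 1728
print(study_gels)                           # 174
print(round(study_gels / (5 * 12), 1))      # 2.9 gels per month over 5 years
```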

The method of this invention can also increase the efficiency of diagnostic studies of the 
genome, when the desired diagnostic procedures involve the detection of genetic changes that 
affect the length of genomic DNA at 6 or more locations. Such changes include additions, 
deletions, intra- and interchromosomal crossover, gene amplification and similar gene 
rearrangements. The loci of many such rearrangements are known and associated with many 
diseases, especially cancers and metabolic errors inherited recessively. PCR using primer pairs 
which direct amplification of a DNA segment including one of these loci can be used 
diagnostically where the rearrangement associated with the disease causes a change in the length 
of the PCR product. A SET of primers designed according to the principles above can be used 
in the production of PCR products that can be analyzed electrophoretically in a single lane, for 
more efficient use of electrophoresis and analysis equipment. 

EXAMPLES 

The following examples describe particular embodiments within the broader invention. 
These embodiments are described for illustrative purposes only, without intention to limit the 
invention. 

EXAMPLE 1 

As an initial test of the fluorescence technology, a study was conducted to compare the 
accuracy and efficiency of these methods with a conventional radiation-based method. Three 
microsatellite loci producing PCR products that overlap in size were chosen to compare the 
accuracy of genotyping by fluorescence versus radiolabeling. Discrepancies between the 
genotypes derived from each technique were resolved by repetition. To estimate the variation 
in sizing of the fluorescence-based technique certain samples were loaded on 3 or more gels for 
comparison. DNA from CEPH (Centre d'Etude du Polymorphisme Humaine, Paris) families 
884, 1331, 1332, 1333, 1362 was amplified for Marshfield markers, mfd 1 (176-196bp), mfd 
59 (175-195bp), and mfd 154 (186-204bp) using the polymerase chain reaction (PCR). 

Fluorescent techniques: The forward and reverse primers were each labelled at the 5' 
end for detection by autoradiography with [³²P]γATP (6000 Ci/mmol) using polynucleotide 
kinase. A primer was selected from each marker for fluorescent labeling on the basis of the 
image of the products (see Figure 8). The optimal annealing temperature was selected for each 
marker empirically by selecting a temperature that eliminated nonspecific annealing or artifactual 
(background) PCR products. Fluorescent labels were attached at the 5' end via phosphoramidate 
derivatization using Aminolink 2 (Applied Biosystems). Primer B (see Figure 10) for mfd 1 was 
labelled yellow (TMR), primer A (see Figure 10) for mfd 59 was labelled blue (FAM), and 
primer B (see Figure 10) for mfd 154 was labelled green (JOE). PCR conditions were: 0.4 µM 
primers, 1.5 µM MgCl2, 50 µM KCl, 200 µM dNTPs and 0.5 units Taq polymerase (final 
concentrations); 94°C for 10 min; followed immediately by 30 cycles of 94°C for 30 sec; 55°C 
(mfd 59, mfd 154) for 30 sec or 60°C (mfd 1) for 30 sec; and 72°C for 30 sec; followed by 
72°C for 7 min. PCR was carried out in a volume of 12.5 µl using 25 ng of CEPH DNA. 
CEPH DNA was stored in a 96 well microtiter plate (Perkin Elmer/Cetus). Amplifications were 
performed in 96 well microtiter plates using a Perkin Elmer/Cetus Model 9600 thermal cycler 
and accessories, maintaining the integrity of the 96 well template. Five microliters were 
combined from each marker for each CEPH individual using a multichannel pipette 
(Transferpette-8, Brinkman). The pooled PCR products were desalted by adding 2 volumes of 
sterile deionized distilled water (ddH2O), ice cold ethanol (100%) equal to the total volume, and 
chilling for 30 minutes at -70°C. The microtiter plate was spun at 4°C at 1400 x g for 2 hours 
in a Beckman Model GS6R centrifuge. The supernatant was aspirated, the pellet was washed 
once with 1.5 volumes of ice cold ethanol (70%), and the plate centrifuged 30 minutes at 
1400 x g at 4°C. The supernatant was aspirated and the plate was air dried. Pellets were 
resuspended in a volume of sterile ddH2O equal to the starting volume (pool). 

Radiolabelled products were separated by conventional electrophoresis and scored 
manually from autoradiographs. Fluorescent PCR products were separated on a 373 sequencer 
with internal size standards in each lane (GeneScan 2500-ROX; Applied Biosystems) and analyzed 
using GeneScan™ 672 software (Applied Biosystems). Each sample (representing 0.5 µl of each 
product) was heated to 99°C after adding 1 µl of the internal lane size standards (GeneScan 
2500-ROX, Applied Biosystems) and 2 µl formamide/EDTA loading buffer, until the total 
volume was reduced to 2-3 µl. Electrophoresis was carried out using 6% acrylamide (Biorad), 
8 M urea (Ultrapure, USB) gels in 1 X TBE. The reduced volume was loaded and run for 4-8 
hours on a model 373 Sequencer (Applied Biosystems) using a 24 cm well to read distance. 

The size of the PCR product is determined by reference to the internal lane size standards 
(Carrano et al. 1989, Genomics, 4:129-136). The size standard ROX-2500 (Applied 
Biosystems), including fragments 37, 94, 109, 116, 172, 186, 222, 233, 238, 269, 286, 361, 
and 479 nucleotides in length, was used with modifications. PCR fragments 61 and 68 
nucleotides in length were gel purified, labelled by aminolinking with ROX, and added in equal 
volumes to the ROX-2500 standards. These fragments were added because desalting by ethanol 
precipitation recovers the unused PCR primers with the products. The intense peak produced 
by the unincorporated labelled primer is seen in the standards because of interference between 
dyes and obscures the detection of the 37 nucleotide standard fragment. Therefore, we have 
modified the GeneScan-2500 standards to provide a fragment of known size labelled with ROX 
to accurately estimate the length of the smallest alleles. 

The GeneScan 672 (version 1.0) software recognizes any peak labelled with ROX, 
computes a calibration curve based on a second-order least-squares fit, and uses these data to 
estimate the allele sizes of the PCR products (Ziegle et al. 1992). Data from each lane can be 
analyzed independently, or four lanes of data for a single fluorescent dye can be displayed 
simultaneously to compare individuals within a family. Allele sizes in nucleotide bases, the 
genotypes, are assigned by interactively distinguishing major peaks from background artifacts. 
The scale on the display can be adjusted to analyze peaks with differences in fluorescent 
intensity. The intensity of each fluorescent band and peak areas provide an objective method 
of distinguishing alleles from artifact (including stuttering bands). Allele sizes can be transferred 
to a spreadsheet database for linkage or a multicolor electrophoretogram. 
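
The sizing step described above can be sketched as follows. This is a minimal illustration of a second-order least-squares calibration, not the GeneScan 672 implementation. The standard lengths are those listed in the text (the 37 nucleotide fragment is omitted because it is obscured by the primer peak, and the 61 and 68 nucleotide fragments are included); the scan positions, the function names `fit_quadratic` and `call_size`, and the query position are hypothetical values chosen only for the example.

```python
# Sketch of allele sizing against internal lane standards.
# Standard lengths are taken from the text; the scan positions are
# hypothetical values invented purely for illustration.

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Normal equations (X^T X) coef = X^T y as a 3x4 augmented matrix.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        acc = m[i][3] - sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = acc / m[i][i]
    return tuple(coef)


def call_size(coef, position):
    """Estimate fragment length (nucleotides) at a given scan position."""
    a, b, c = coef
    return a + b * position + c * position ** 2


# ROX-labelled standards detected in one lane (lengths from the text).
STANDARD_SIZES = [61, 68, 94, 109, 116, 172, 186, 222, 233, 238, 269, 286, 361, 479]
# Hypothetical scan positions of those standards in the same lane.
STANDARD_POSITIONS = [57.1, 60.1, 71.4, 77.9, 80.9, 104.4, 110.2, 125.0,
                      129.4, 131.4, 143.8, 150.5, 179.4, 222.6]

lane_curve = fit_quadratic(STANDARD_POSITIONS, STANDARD_SIZES)
estimated = call_size(lane_curve, 108.0)  # size of a peak seen at position 108.0
print(round(estimated, 2))
```

Because the curve is fitted separately for each lane, lane-to-lane mobility differences are absorbed by the standards rather than by the sample peaks.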
mfd 1, mfd 59, and mfd 154 PCR products overlap in size (175-204 bp) (see Figure 10). 
There was no evidence of interference between the dyes even when there was complete overlap 
during the electrophoresis of PCR products, similar to that reported by Ziegle et al., 1992. In 
our experience, interference between dyes does become a problem with overloaded samples. 
A comparison of the genotyping results of the radioactive and fluorescent labeling methods 
revealed 4 discrepancies out of 462 possible comparisons (alleles) (see Table 1). One 
transcription error occurred in the manual data manipulation of the fluorescently labelled 
products. There was no interference between fluorophores with the detection of the overlapping 
products using the four dyes. No sizing errors were attributed to the fluorescence-based 
technique and each marker displayed Mendelian inheritance. The average size variation across 
all comparisons was 0.28 nucleotides. However, the maximum difference (range) found for any 
of the 462 comparisons was 0.47 nucleotides (see Table 2). Generally, sizing varied less within 
a gel than between gels. The variation in the size of the alleles was similar when comparing 
each of the individual markers. The remaining discrepancies occurred with the use of the 
standard radioactive-based protocol and represented an error rate of less than 1%. Inaccurately 
sized PCR products and sample misloadings produced mistypings with the conventional 
technique (see Table 1). In general, fluorescent internal size standards provided more precise 
sizing than did radiolabeling. These data demonstrate both improved accuracy and efficiency 
for typing SSR markers with use of fluorescence-based techniques. 
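
The error rates quoted above follow directly from the discrepancy counts. The short sketch below tallies the four discrepancies of Table 1 by the method that produced the incorrect score (the method not marked with an asterisk) and divides by the 462 comparisons; the dictionary layout and the function name `error_rate` are illustrative choices only.

```python
# Tally of the Table 1 discrepancies by the method that gave the wrong score.
TOTAL_COMPARISONS = 462  # allele comparisons reported in the text

DISCREPANCIES = {
    "884-18/mfd 1": "radiolabelled",     # size estimate error
    "1331-16/mfd 59": "radiolabelled",   # gel loading error
    "1331-17/mfd 59": "radiolabelled",   # gel loading error
    "1332-15/mfd 154": "fluorescence",   # recording (transcription) error
}


def error_rate(method):
    """Fraction of the 462 comparisons mistyped by the given method."""
    count = sum(1 for m in DISCREPANCIES.values() if m == method)
    return count / TOTAL_COMPARISONS


for method in ("radiolabelled", "fluorescence"):
    print(method, f"{100 * error_rate(method):.2f}%")
```

Three of 462 comparisons is about 0.65%, consistent with the stated rate of less than 1% for the radioactive-based protocol.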
TABLE 1 

CEPH DNA/Marker     Genotype          Genotype          Explanation 
                    Radiolabelled     Fluorescence 

884-18/mfd 1        178,192           178,194*          size estimate error 
1331-16/mfd 59      179,179           179,185*          gel loading error 
1331-17/mfd 59      179,170           179,185*          gel loading error 
1332-15/mfd 154     185,200*          200,200           recording error 

* indicates correct score by length in nucleotide residues 

TABLE 2 

                    RANGE (in nucleotides) 
COMPARISON          Maximum     Average     Standard Deviation 

intergel            0.47        0.28        0.08 
intragel            0.42        0.18        0.07 
mfd 1               0.35        0.19        0.1 
mfd 59              0.37        0.15        0.08 
mfd 154             0.42        0.23        0.06 

Superscripts indicate number of samples (superscript values not legible) 

EXAMPLE 2 

Mapping with Fluorescent Primers 

Genomic DNA is isolated as described by M.J. Johns, et al., Analytical Biochem., 
122:276-278 (1989). 

To minimize sample handling, DNA templates can be stored in a 96 well grid (e.g., 
Perkin Elmer/Cetus). The integrity of the grid may be maintained throughout the protocol to 
avoid errors introduced by manual pipetting and sample handling. Multichannel pipetting from 
a 96-well grid expedites sample handling while minimizing human errors. 

PCR is performed in a reaction volume of 12.5 µl, containing 50 µM dATP, dGTP, 
dTTP, dCTP; 0.07 µM of the labelled oligonucleotide primer, and 4 µM of the unlabelled 
primer. Taq polymerase (Perkin-Elmer/Cetus), 0.5 units, is added on ice. PCR will usually be 
performed in a thermalcycler, e.g., a Perkin-Elmer/Cetus 9600 thermalcycler. Standard 
thermalcycler settings are 94°C for 10 minutes, followed by 30 cycles of 94°C for 30 seconds, 
30 seconds at the average annealing temperature for the primers and 72°C for 30 seconds; final 
extension is at 72°C for 7 minutes. 
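
The standard thermalcycler settings above can be written out as an explicit step list, which makes the programmed hold time easy to check. The sketch below is illustrative only: the `Step` record and the function names are assumptions for the example and do not correspond to any thermalcycler's programming interface, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees C
    seconds: int    # hold time at that temperature


def build_program(annealing_c, cycles=30):
    """Initial denaturation, `cycles` rounds of 94/anneal/72, final extension."""
    program = [Step(94.0, 10 * 60)]
    for _ in range(cycles):
        program += [Step(94.0, 30), Step(annealing_c, 30), Step(72.0, 30)]
    program.append(Step(72.0, 7 * 60))
    return program


def hold_minutes(program):
    """Total programmed hold time in minutes (ramping not included)."""
    return sum(step.seconds for step in program) / 60


print(hold_minutes(build_program(55.0)))  # 62.0
```

Only the annealing temperature changes from marker to marker, so a whole plate amplified for one marker shares a single program.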

Labelled PCR products are purified by co-precipitation in EtOH. 24 markers may be 
co-precipitated simultaneously in the 96-well format using ethanol. Ethanol precipitation desalts 
the products but copurifies the primers. The labelled primer peak produces an enormous signal 
that complicates the analysis of products under 93 nucleotides in length because it interferes with 
the 37 nucleotide ROX GeneScan-2500 standard. As an alternative, internal standards may 
incorporate fragments that are 50, 60, and/or 70 nucleotides in length in addition to the 
GeneScan 2500 standard fragments or an equivalent set of fragments. 

The amplified products are analyzed by denaturing gel electrophoresis (Sambrook, et al.). 
Loading buffer (2X concentration) is added to an equal volume of the PCR reaction, and the 
PCR reaction is loaded on a 6% polyacrylamide gel. Radioactive products will be sized against 
a sequence ladder; the gels are dried and then exposed to Kodak XAR film for 4-24 hours with 
or without intensifying screens. Fluorescent labelled PCR products may alternatively be 
analyzed by semi-automated detection using, e.g., an ABI 373A automated sequencer and 
GeneScan 672 software from Applied Biosystems, Inc. 
EXAMPLE 3 

PCR products are produced as in Example 2 and then purified and combined for 
electrophoresis using a magnetic bead protocol in place of EtOH precipitation. One of each pair 
of primers is labelled with biotin and the other with a fluorescent label as above. Double 
stranded PCR products are purified using streptavidin conjugated to paramagnetic beads to bind 
the primer 5' labelled with biotin. This procedure may be easily adapted to the 96-well format 
in any laboratory without expensive centrifuges. After the DNA bound to magnetic beads is 
separated from the PCR reaction media, the two strands are melted and separated, and the strand 
labelled with the fluorescent primer is pooled with other labelled strands of its GROUP for 
electrophoresis. The result of increasing the amount of beads used for separation of a single 
PCR product from its PCR reaction mix is shown in Figure 12. 
EXAMPLE 4 

OPTIMIZATION OF PRIMER SETS 
DNA Templates 

CEPH parents and/or unrelated volunteers as controls may be tested. In addition, we 
usually include one "no DNA" control and one reference individual (alleles known) on each 
plate. To maximize the use of resources, each marker may be optimized using 12 wells or less 
of a 96-well plate. Eight markers are amplified per plate at a single temperature. Alternatively, 
a thermalcycler with a smaller sample capacity may be used. 

The 5' end of the primers to be tested is labelled with 32P using the polynucleotide kinase 
reaction. Mix 5 µl sterile ddH2O, 2.8 µl 5x kinase buffer (250 mM Tris, 50 mM MgCl2, 50 mM 
DTT, 0.25 mg/ml BSA), 6.0 µl 10 µM primer, 0.8 µl T4 polynucleotide kinase, and 3.0 µl 
[γ-32P]ATP (6000 Ci/mmol). Incubate at 37°C for 1 hour, then add 26 µl sterile ddH2O and spin 
through a select D column (Five Prime Three Prime) loaded with P4 Biogel (BIORAD) according 
to the manufacturer's recommendations. The labelled primers may be stored at -10°C. 

For optimization, set up simultaneous PCR reactions as described in Example 2, using 
DNA templates described above (e.g., 2 CEPH (1331-1, 1347-02), 1 pooled sample (50 
chromosomes), 1 no DNA). Perform PCR at the annealing temperature (T) calculated as 
follows: 

T = 2(A+T) + 4(G+C) 

where A, T, G, and C on the right-hand side are the numbers of each base in the primer. (If 
the calculated temperatures for 2 primers differ greatly, for example 54°C and 64°C, begin 
closer to the lower T.) 

Check the amplified PCR product for artifact by electrophoresis on a 6% gel. Continue 
optimization of the selected 32P-labelled primer with control individuals, increasing the annealing 
temperature in 2°C increments until nonspecific products are eliminated. On average, 
determinations at approximately 4 T values are required to optimize each primer. 
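
The starting-temperature calculation and the 2°C stepping schedule can be sketched as below. The rule coded here is the 2(A+T) + 4(G+C) estimate; taking the lower of the two primer estimates as the starting point is a simplifying assumption standing in for "begin closer to the lower T", and all function names are hypothetical.

```python
def annealing_estimate(primer):
    """Estimate T (degrees C) as 2*(A+T) + 4*(G+C) for a primer sequence."""
    seq = "".join(primer.upper().split())  # drop the triplet spacing
    unknown = set(seq) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected characters in primer: {sorted(unknown)}")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def starting_temperature(primer_a, primer_b):
    """Assumption: start from the lower of the two primer estimates."""
    return min(annealing_estimate(primer_a), annealing_estimate(primer_b))


def candidate_temperatures(start, step=2, count=4):
    """Annealing temperatures to try, rising in fixed increments."""
    return [start + step * i for i in range(count)]


# A 20-mer with 7 A/T and 13 G/C bases gives 2*7 + 4*13 = 66.
print(annealing_estimate("GTC AGC ACC CCA ACC AGC CT"))  # 66
print(candidate_temperatures(54))  # [54, 56, 58, 60]
```

In practice the schedule stops as soon as nonspecific products disappear on the gel, so the list above is an upper bound on the runs needed rather than a fixed plan.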

When all markers from a SET are optimized (usually 8 markers), 3 µl from a pool of 
PCR product of DNA from unrelated individuals using primers for each marker in the SET is 
combined with an equal volume of loading buffer (2X concentration). Seven µl (or maximum 
well volume) of the combined mixture is loaded on a gel and electrophoresed. This last check 
on size and product intensity assures that the markers are robust and are spaced about 10 
nucleotides apart. The primer sequences may then be used to synthesize fluorescent/biotinylated 
products. 

EXAMPLE 5 

A protocol extending this approach to include up to 24 microsatellite markers in each 
electrophoretic lane was tested as follows. The selection of markers was based on the need to 
maximize heterozygosity (genetic informativeness), to distribute markers across the entire genetic 
map, and to place each marker within a SET based on the known size of the PCR 
products (alleles and stuttering bands produced must not overlap with those of the marker above 
or below it). 

Highly informative microsatellite markers were assembled into a ladder or "SET". Each 
marker in a SET is spaced a distance of at least 9 nucleotides from surrounding markers such 
that none of the PCR products overlap in size when separated on a 6% denaturing acrylamide 
gel. Since many dinucleotide repeats produce a complex pattern of 3 or more stutter bands, this 
spacing is critical to assure that more intense stutter bands from an upper marker will not be 
misinterpreted as a product from a lower marker. In addition, new alleles both larger and 
smaller than the reported product sizes for this type of marker have occasionally been 
discovered. Each SET was labelled with one of three different commercially available 
fluorophores (TMR, FAM, and JOE; Applied Biosystems). The fourth fluorophore (ROX) was 
reserved for the internal size standard. Three SETS each labelled with a different fluorophore 
were pooled into a collection of markers we have termed a "GROUP". 

New primers were designed as necessary using OLIGO 4.0 (Research Genetics, 
Huntsville, AL) to fit within the marker ladder. Each GROUP was constructed to avoid overlap 
between markers within SETS but to allow overlap between SETS. 
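
The spacing rule for a SET lends itself to a simple check. The sketch below flags any two markers in one SET whose product size ranges come within fewer than 9 nucleotides of each other; the function name `spacing_violations` and the data layout are illustrative assumptions. As test data it uses the three markers of Example 1, whose products overlap and which therefore had to be labelled with three different fluorophores rather than placed in one SET.

```python
from itertools import combinations


def spacing_violations(markers, min_gap=9):
    """Return (marker_a, marker_b, gap) for every pair spaced under min_gap.

    `markers` maps a marker name to (smallest, largest) product size in
    nucleotides. A negative gap means the two size ranges overlap.
    """
    problems = []
    for (name_a, (a_min, a_max)), (name_b, (b_min, b_max)) in combinations(
        markers.items(), 2
    ):
        gap = max(a_min, b_min) - min(a_max, b_max)
        if gap < min_gap:
            problems.append((name_a, name_b, gap))
    return problems


# Product size ranges from Example 1 (in bp).
example_1 = {"mfd 1": (176, 196), "mfd 59": (175, 195), "mfd 154": (186, 204)}

for pair in spacing_violations(example_1):
    print(pair)
```

The check is applied within one SET only; markers in different SETS of a GROUP carry different fluorophores, so their size ranges are allowed to overlap.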

The autoradiographic image produced by many markers varied depending on whether the 
forward or reverse primer was labelled (see Figure 8). Therefore, both primers from each 
marker were evaluated for image clarity and the ability to distinguish the most intense product(s) 
or alleles. The appropriate primer was then selected for further use. Optimization of the PCR 
conditions for each marker was also accomplished using radiolabeling. The strategy of 
developing a ladder of markers warranted that the conditions for PCR eliminate nonspecific 
annealing and background bands. When nonspecific annealing could not be eliminated by raising 
the annealing temperature, a new marker was chosen for use. Thus uniform PCR conditions as 
described in Example 1 were used for all the markers chosen except that the annealing 
temperature was specific to each marker. GROUPS 1 and 2 have 6 and 9 different annealing 
temperatures, respectively (see Figures 7A and B). An entire microtiter plate containing DNA 
from a number of different individuals will usually be amplified for a given marker at one 
temperature at a time, so this should not reduce the overall efficiency of the protocol. For 
studies with fewer samples a thermalcycler block with a lower capacity may be used. 

Variability among thermalcycler operating temperatures may require adjusting the 
annealing temperature when switching from one machine to another. Therefore the use of the 
protocols described for marker GROUPS 1 and 2 should be preceded by a reevaluation of the 
suggested annealing temperatures for optimal performance. This can generally be carried out 
once on a few markers and when necessary the annealing temperatures can be adjusted up or 
down for all the markers for that machine. 

The intensity of the products varied considerably from marker to marker. When markers 
were radiolabelled and a SET was run on the same gel, detecting all of the products on the gel 
with a single film exposure was often impossible. Attempts to score on a single gel the larger 
products in each SET using radioactive-based techniques were unsuccessful. Although gradient 
gels improved the band spacing, a maximum of 4-5 markers could be resolved per gel on 
autoradiographs. An autoradiograph of GROUP 2 SET B is shown in Figure 9. The range of 
intensity in the products of this SET is typical of this type of marker and multiple 
autoradiographs are required for genotyping. These problems are partially overcome by the use 
of fluorescent labels (Ziegle et al., 1992). Fluorescent signal detection is linear over a greater 
range, so that the markers with the weakest product intensity are more readily typed in real-time 
along with the most intense products from other markers. 

Marker GROUPS 1 and 2 are described in Figures 7A and B, respectively. The primer 
sequences, chromosomal location, choice of labelled primer, and optimal annealing temperature 
are listed for each locus. GROUP 1 is composed of a combination of 21 di-, tri-, and 
tetranucleotide markers from multiple linkage groups. The product sizes range from 66 to 322 
nucleotides. GROUP 2 is composed of 24 dinucleotide markers with products ranging in size 
from 75 to 349 nucleotides. The mean heterozygosity for both GROUPS is 74%. 

Scoring of the fluorescent products using the ABI 373 sequencer and GeneScan 672 
software was unambiguous in samples that were desalted by ethanol precipitation. Desalting was 
carried out as follows: 5 µl of each PCR product from the same SET (like color) was combined. 
Then 1.0 µl per marker per SET was combined for each of the 3 SETS giving a final volume 
equal to the total number of markers in the GROUP. Sample handling was otherwise exactly 
as described above for the individual fluorescent markers. 

A typical set of electrophoretograms of each SET from GROUP 2 for a single individual 
is illustrated in Figure 5. Each of the alleles can be easily recognized by the unique signature 
of the stuttering bands for these dinucleotide repeat markers amplified by PCR. Samples that 
were not desalted were difficult to score because the mobilities of the products and the 
ROX-2500 internal lane standards were altered. Salt and primer loads become a problem when 
combining multiple products for electrophoresis because the necessary volume reduction results 
in sample concentration. The salt concentration rises with the product concentration and 
interferes with the separation of the products and standards. This becomes critical when pooling 
21 to 24 markers. 

It will be understood that while the invention has been described in conjunction with 
specific embodiments thereof, the foregoing description and examples are intended to illustrate, 
but not limit the scope of the invention. Other aspects, advantages and modifications will be 
apparent to those skilled in the art to which the invention pertains, and these aspects and 
modifications are within the scope of the invention, which is limited only by the appended 
claims. 
CLAIMS: 

1. A kit for use in automated genotyping within a population comprising at least 4 
GROUPS of at least three SETS each comprising labelled pairs of primers for amplification of 
DNA by polymerase chain reaction (PCR), 

each primer pair having unique sequence found in the flanking sequences of a 
microsatellite sequence comprising a nucleotide repeat sequence flanked by unique sequences, 
such that a polymerase chain reaction (PCR) primed with the primer pair amplifies the 
nucleotide repeat sequence and at least some immediately adjacent unique sequences of the 
microsatellite sequence to produce a PCR product identified with the primer pair, wherein the 
microsatellite sequences are nucleotide repeat sequences that are polymorphic within the 
population, 

each SET consisting of at least 6 primer pairs, each primer having the sequence 
of unique sequences respectively flanking at least 6 microsatellite sequences in the genome, such 
that the length of the segment amplified by a particular primer pair differs from the length of 
all other segments in the SET by at least 5 nucleotides, and at least one primer of each primer 
pair is labelled with a fluorescent label that is the same fluorescent label for all primer pairs in 
the SET, 

each GROUP consisting of at least three SETS of primer pairs labelled with 
fluorescent labels, wherein the wavelength at which the respective fluorescent labels fluoresce 
is substantially different for the labelled primers in each of the respective SETS, 

wherein the distance in the genome between one microsatellite sequence amplified 
by a primer pair of the kit and the nearest other microsatellite sequence amplified by another 
primer pair of the kit is at least 2 centimorgans (cM) and no more than 50 cM. 
2. The kit of claim 1, wherein the PCR products identified with any primer pair 
amplifying microsatellite sequences containing dinucleotide repeats differ in length from PCR 
products identified with all other primer pairs of the same SET by at least 9 nucleotides. 

3. The kit of claim 1, wherein one of said GROUPS consists of the three SETs of 
Figure 7A. 

4. The kit of claim 1, wherein one of said GROUPS consists of the three SETs of 
Figure 7B. 

5. The kit of claim 1, containing the 6 SETs shown in Figures 7A and 7B. 

6. A method of analyzing genomic DNA for the presence of polymorphisms 
comprising 

a) extracting DNA from a human sample; 

b) combining, in a polymerase chain reaction (PCR) vessel, an aliquot of said 
DNA from a human sample, at least one primer pair selected from a GROUP in the kit of claim 
1, and PCR amplification enzymes; 

c) cycling the temperature of each PCR vessel so that PCR products identified 
with said at least one primer pair are produced by PCR amplification of segments from said 
DNA from a human sample, each vessel being cycled at an annealing temperature wherein 
non-specific annealing of the primers to said DNA from a human sample is minimized; 

d) then combining all PCR products from all PCR vessels containing primer 
pairs from one GROUP into a mixture, and subsequently separating the mixture of PCR products 
electrophoretically by size; 

e) detecting separated PCR products by fluorescence detection at wavelengths 
corresponding to the fluorescent wavelength for each of the fluorescent labels in the kit. 
7. The method of claim 6, wherein the step of combining amplified DNA further 
comprises: 

i) contacting each vessel with a plurality of paramagnetic beads carrying on 
the surface a protein which specifically binds biotin, further wherein one primer of each primer 
pair is labelled with a fluorescent label and the other with biotin, for a period sufficient for said 
protein to bind biotin; 

ii) separating the magnetic beads from the PCR reaction medium; 

iii) separating the two strands of the amplified DNA segments and combining 
the strands labelled with a fluorescent label for all primer pairs from one GROUP into a 
mixture. 

8. The method of claim 6, wherein the step of combining amplified DNA from the 
PCR vessels further comprises: 

i) contacting each vessel with a plurality of magnetic beads carrying DNA 
complementary to the sequence of one primer of the primer pair in the vessel for a period 
sufficient to allow annealing between the primer and the DNA on the magnetic beads; 

ii) separating the magnetic beads from the PCR reaction medium; and 

iii) eluting the PCR product from the magnetic beads. 

9. The method of claim 6, wherein each primer pair of said kit is added to a 
different PCR vessel in step (b), such that the annealing temperature for temperature cycling in 
step (c) is the temperature wherein non-specific annealing of the unique primer pair is minimized 
and PCR product from all PCR vessels containing at least one primer pair from the same 
GROUP are combined in a single mixture before electrophoretic separation. 
10. A method for selecting a SET of PCR primers for use in automated genotyping 
comprising 

selecting at least 6 microsatellite sequences in the human genome, wherein the 
microsatellite sequences are selected from dinucleotide, trinucleotide and tetranucleotide repeat 
sequences that are flanked by unique sequences, said microsatellite sequences being separated 
from each other by at least 2 centimorgans in the genome and being polymorphic within the 
population; 

constructing primer pairs for each microsatellite sequence, said primers having 
the sequence of the unique sequences flanking the microsatellite sequences, such that the length 
of all polymorphs of the DNA segment amplified by a particular primer pair is detectably 
different from the length of all polymorphs of other segments amplified by primers in the SET. 

11. A kit for use in automated genotyping comprising at least 4 GROUPS of at least 
3 SETS of PCR primers obtained by the method of claim 10. 

12. The kit of claim 11, wherein at least one primer of each primer pair in the SET 
is labelled with a fluorescent label that is the same fluorescent label for all primer pairs in the 
SET. 

13. The kit of claim 11, wherein the length of all polymorphs of the DNA segment 
amplified by any primer pair amplifying microsatellite sequences containing dinucleotide repeats 
differs in length from the DNA segment amplified by all other primer pairs of the same SET by 
at least 9 nucleotides. 

14. A method of analyzing genomic DNA for the presence of polymorphisms 
comprising 

a) extracting DNA from a human sample; 

b) combining, in a polymerase chain reaction (PCR) vessel, an aliquot of said 
DNA from a human sample, at least one primer pair selected from a GROUP in the kit of claim 
11, and PCR amplification enzymes; 

c) cycling the temperature of each PCR vessel so that PCR products 
consisting essentially of amplified DNA segments labelled with detectable labels are produced 
by PCR amplification and the PCR products for all primer pairs in the SET are detectably 
labelled with the same label, each vessel being cycled at an annealing temperature wherein 
non-specific annealing is minimized; 

d) separating electrophoretically by size a mixture containing all PCR 
products amplified from said DNA from a human sample by any primer pair of said SET; 

e) detecting separated detectably labelled PCR products and characterizing 
them by length. 

15. The method of claim 14, wherein the mixture in step (d) containing all PCR 
products amplified from said DNA from a human sample by any primer pair of said SET is 
obtained by: 

i) contacting each vessel with a plurality of paramagnetic beads carrying on 
the surface a protein which specifically binds biotin, further wherein one primer of each primer 
pair is labelled with a fluorescent label and the other with biotin, for a period sufficient for said 
protein to bind biotin; 

ii) separating the magnetic beads from the PCR reaction medium; 

iii) separating the two strands of the amplified DNA segments and combining 
the strands labelled with a fluorescent label for all primer pairs from one GROUP into a 
mixture. 
16. The method of claim 14, wherein the mixture in step (d) containing all PCR 
products amplified from said DNA from a human sample by any primer pair of said SET is 
obtained by: 

i) contacting each vessel with a plurality of magnetic beads carrying DNA 
complementary to the sequence of one primer of the primer pair in the vessel for a period 
sufficient to allow annealing between the primer and the DNA on the magnetic beads; 

ii) separating the magnetic beads from the PCR reaction medium; and 

iii) eluting the PCR product from the magnetic beads. 

17. A kit for analysis by polymerase chain reaction (PCR) of a genomic region 
containing at least 6 known loci at which genetic rearrangement is diagnostic for a disease, 
comprising at least one SET containing at least 6 PCR primer pairs, 

each primer pair having the sequence of unique sequences flanking one of said 
at least 6 loci of genomic rearrangement, such that a polymerase chain reaction (PCR) primed 
with the primer pair amplifies the DNA segment surrounding the locus of rearrangement to 
produce a PCR product of characteristic length, wherein the length of the PCR product is 
associated with specific diagnostic information, and wherein the length of the PCR product 
amplified by a particular pair of primers differs from the length of all other PCR products 
amplified by other primers in the SET and the PCR products for all primer pairs in the SET are 
detectably labelled with the same label. 

18. A diagnostic method for detection by polymerase chain reaction (PCR) of genomic 
rearrangement in a genomic region containing at least 6 known loci at which genetic 
rearrangement is diagnostic for a disease, comprising 

(a) extracting DNA from a human sample; 

(b) combining, in a polymerase chain reaction (PCR) vessel, an aliquot of said 
DNA from a human sample, at least one pair of amplification primers selected from a SET of 
at least 6 primer pairs, and PCR amplification enzymes, each primer pair of said SET having 
the sequence of unique sequences flanking one of said at least 6 loci of genomic rearrangement, 
such that a polymerase chain reaction (PCR) primed with the primer pair amplifies the DNA 
segment surrounding the locus of rearrangement to produce a PCR product of characteristic 
length, wherein change in the length of the PCR product is associated with rearrangement at the 
locus of rearrangement, and wherein the length of PCR products amplified by a particular pair 
of primers differs from the length of all other PCR products amplified by other primers in the 
SET; 

(c) cycling the temperature of each PCR vessel so that PCR products 
consisting essentially of amplified DNA segments labelled with detectable labels are produced 
by PCR amplification and the PCR products for all primer pairs in the SET are detectably 
labelled with the same label, each vessel being cycled at an annealing temperature wherein 
non-specific annealing is minimized; 

(d) separating electrophoretically by size a mixture containing all PCR 
products amplified from said DNA from a human sample by any primer pair of said SET; 

(e) detecting separated detectably labelled PCR products and characterizing 
them by length. 

19. The method of claim 14, wherein each primer pair of said SET is added to a 
different PCR vessel in step (b), such that the annealing temperature for temperature cycling in 
step (c) is the temperature wherein non-specific annealing of the unique primer pair is minimized 
and PCR product from all PCR vessels containing at least one primer pair from said SET 
are combined in a single mixture before electrophoretic separation. 
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FIG, 7A-I 
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FIG. 7A-2 

GROUP 1 



A Primer B Primer 



5'-GTC AGC ACC CCA ACC AGC CT-3'
5'-TCC AGC CTC GGA GAC AGA AT-3'
5'-GTT AGC ATA ATG CCC TCA AG-3'
5'-AAG AAC CAT GCG ATA CGA CT-3'
5'-CAT AGC GAG ACT CCA TCT CC-3'
5'-CAG AAA ATT CTC TCT GGC TA-3'
5'-AGC TAT CAT CAC CCT ATA AAA T-3'

5'-ACC GAA GAC CCC TCC TGT GG-3'
5'-AGT CCT TTC TCC AGA GCA GGT-3'
5'-CGA TGG AGT TTA TGT TGA GA-3'
5'-CAT TCC TAG ATG GGT AAA GC-3'
5'-GGG AGA GGG CAA AGA TCT AT-3'
5'-CTC ATG TTC CTG GCA AGA AT-3'
5'-AGT TTA ACC ATG TCT CTC CCG-3'

5'-CTG TTA TGG GAC TTT TCT CA-3'
5'-ATG ACT TCC CCA CTT TTT AC-3'
5'-ACT TTG AAA ACC ACT GGC CT-3'
5'-AGC TAT AAT TGC ATC ATT GCA-3'
5'-ATC TCT GTT CCC TCC CTG TT-3'
5'-AAG CTT GTA TCT TTC TCA GG-3'
5'-GTA TTT TTG GTA TGC TTG TGC-3'

5'-AAT CTT CTT TTT TGT CTA TGA-3'
5'-GTG CCA TTT TAG AGT CTC CT-3'
5'-GCT AGC CAG CTG GTG TTA TT-3'
5'^AG AGG GAG GGC CTG CGT TC-3'
5'-TTA AAA TGT TGA AGG CAT CTT C-3'
5'-TTC TGA TAT CAA AAC CTG GC-3'
5'-AAA AGT GTG TTA CTT TCA GAA C-3'




5'-AAT GTA TGA AGT GGT ATG AT-3'
5'-GCT GAG ATG GGA GGA TTG CT-3'
5'-ATG TAT CTA GCC ATG GTA GC-3'
5'-TGG TCT ATA ACT GGT CTA TG-3'
5'-CTT ATT GGC CTT GAA GGT AG-3'
5'-ATC TAC CTT GGC TGT CAT TG-3'
5'-CTA TTT TGG AAT ATA TGT GCC T-3'

5'-CGT TTG ACT CCG TGT GTT lOK-V
5'-TTT CCA TTG TCT GTC CGT TT-3'
5'-ACC ACT CTG GGA GAA GGG TA-3'
5'-CAC CCA GGG CCA GAT AAA GA-3'
5'-TTT GAG TAG GTG GCA TCT CA-3'
5'-AAG GAT ATT GTC CTG AGG A-3'
5'-ACA AGG TGA CAA GGT GCC TA-3'
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FIG. 7A-3 
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FIG. 7B-I 
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FIG, 7B-2 

GROUP 2 

A Primer B Primer 



5'-AAA CCC AAA CCC AGA GGA Tl-V
5'-GGC ATG TCA TTT TCG TAA GC-3'
5'-AAT ATG GCT ACA GCA TTG GA-3'
5'-GAG CGA CAG CAA AAT CAG CC-3'
5'-ATA TGG AAA CTC TCC GTA CT-3'
5'-AGT TAC ACC GGT TCT GCA GA-3'
5'-ACT GCC TCA TCC AGT TTC AG-3'
5'-TCC TGG CTT TAA ACT TCA CAC AC-3'

5'-GAA CAG AAC AGT GGA GCA TC-3'
5'-TAG GAG GCA GAG GAT GGT TC-3'
5'-CCC CAC TCT TAG CCA TTG TA-3'
5'-TGG AGA TGT GCC ATA GAG GT-3'
5'-TTC AAG TGG TTG CCT CTG GC-3'
5'-ATG CTT TAT CCA GAG AAA AG-3'
5'-CAA ACT TTC CAC AGT ATC GTT C-3'
5'-CCA AAT GCT GGA GAC AGA GAG AA-3'

5'-TTC TCA CAA AGT CAC CAC AT-3'
5'-GGC CTC CTG GAA TAA TTC TC-3'
5'-Cn GTT CAT CTG CCT TGT GC-3'
5'-ATC AAT GGA AAA ATG GGT AA-3'
5'-ACT GGG GAA CAT GGT GGG GT-3'
5'-TTT ATG CGA GCG TAT GGA TA-3'
5'-TCC TCA AAA TGA AGA ACA CA-3'
5'-CCT GGA AAA ATG GCT CAC C-3'

5'-AGG TGG GTG GAT AAC TTG KQ-V
5'-GTG GGC CAC ATT AGG AAC AG-3'
5'-TGG GCG ATT TGT TCA TTG TG-3'
5'-TGG AAG GAC GGG AAA TAA TA-3'
5'-GCA ACC ATG GAG AGT CTG GA-3'
5'-GAT TAA TGA TAG TGC TAT CC-3'
5'-GAG CAG GCA CTT GTT AGA TG-3'
5'-GGA ATA TGT TTT TAT TAG CTT GT-3'

5'-QQC ATA CGA GAA AAT ACT GT-3'
5'-CAC CAG CCC CAT TCC TTA GC-3'
5'-GAG ACA CAG AGC AAA TAG GT-3'
5'-TCA GGA AAA CTG CCT GAG G-3'
5'-AGC AAC TTG CCC AGG CTA TGA-3'
5'-CAT CAT TAA TTG GAT TGT GG-3'
5'-GTT TCC TTG AGA AGA ATG GAG C-3'
5'-ACC CCT CCC TCC CTC CAT CAC AC-3'

5'-TAG GGA AAA TGA CAG GAA AA-3'
5'-CAT TTT AAT GAA CAC CGC TC-3'
5'-ACC TAA GCG ACT GCC TAA AC-3'
5'-TAT CTT TCT CTG TCT GCC Tl-V
5'-ATG ATG ATT GCC AAA GGG AA-3'
5'-CAC CAC CAT TGA TCT GGA AG-3'
5'-AAA AGT CTA GTG TTG AGT GT-3'
5'-GGA AAA TCA GTC TCT AGT TG-3'

WO 95/15400

PCT/US94/13945

FIG. 7B-3
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FIG. 7C-I 
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FIG. 7C'2 

GROUP 3 



A Primer 



B Primer 



5'-GAC TTC ACC ATC AAC GCC TG-3'
5'-GAG CAG CAC CGT ACA AAT-3'
5'-TAA CAT GAG CGA ATG GAC AA-3'
5'-GCC CAG GAG GTT GAG G-3'
5'-GGT ATG GAA GTC ACC CAA CA-3'
5'-CAC ACA GGC TCA CAT GCC-3'
5'-TCA TGT CCC TCC TCC CAA AG-3'
5'-GCT ACT CAG GCA TGA GCG-3'

5'-CAG GAA AGT GGA TGT GAC GA-3'
5'-AGC TCC GCT CCC TGT AAT-3'
5'-CAA GGT TTC ACC ACA GTT CT-3'
5'-AAG GCA GGC TTG AAT TAC AG-3'
5'-CTC AAA ATG ACT GAT GGG GT-3'
5'-GCT CCA GCG TCA TGG ACT-3'
5'-GAG CAA GCA TCC AAA AAC GA-3'
5'-GGT CAC TTG ACA TTC GTG G-3'

5'-CAT TGC AAA CTC AGG AGA TA-3'
5'-AAA CTG TGG TCC TGG CTG-3'
5'-AAA TTC TAG ACA TCG CCT GTA A-3'
5'-GAC ACA GGT AGG TTA GAA GGA TG-3'
5'-CCA GNC TCG GTA TGT TTT TAC TA-3'
5'-AAA AAC GTA CTG CCA CAT TC-3'
5'-AGC CAG CAT TAC CTC TGN TAC C-3'
5'-TTA GCA AAT CCC AAG CAA TA-3'

5'-TAA CAG AGG CAT GAA AAC CA-3'
5'-AAA GTA GAG TCC TGG CCT GA-3'
5'-GGT ACC ATC ACC ACA ATC AA-3'
5'-TGT CTT GGT GAA TTG ACC CT-3'
5'-CTG AAA CCT CTG TCC AAG CC-3'
5'-ACT TGT AGG CCT GTT CTG AG-3'
5'-GAT CAC AGA TAT TGG CCC ATA G-3'
5'-GTG ATG GTG GTA AAG GCA GA-3'

5'-GGT GCC AGA CTA TGC AGA CC-3'
5'-GGC TGT GGG TGT TTC TCC TA-3'
5'-GAT CGC CTA TGA CCT CCT TG-3'
5'-TTA ATA AAA ATA CCC CCA CC-3'
5'-GCG CTC TTG GTA TAT GGT ACA Q-V
5'-GAA TGT GAA AGG CTG TGC-3'
5'-TGG CCT GAA TAG ACC ATA AAA A-3'
5'-CAA CAC CCA AAC AGA TGA CC-3'

5'-TAT GCT GAT TTA GGG AGC CC-3'
5'-AGC TCT CAT GNC TTT ACA TTC T-3'
5'-GCT GTC TGT GAG AGT TCG CA-3'
5'-GGA AAT AGG TGT GAA CAA AA-3'
5'-TGT GGG CAA CGT CAC TC-3'
5'-AAA ATT ACA AAG AAG ACC-3'
5'-GCC TGG GTG ACA AAG CA-3'
5'-AGT CTT TCA TGG CCA CTG TG-3'
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FIG. 70-3 
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FIG, 7D-2 

GROUP 4 

A Primer



5'-GAG GCA GGA GAA TCA CTT-3'
5'-AGA TGA GGG GTA ATG TTG GA-3'
5'-TTC GCT CTT TGA TAG GC-3'
5'-CCC CTT GGA AAA TCA CTG-3'
5'-CCT AAG TAG GCA GTT GGT AT-3'
5'-AAC TTA CAC ATT TGG CCC TG-3'
5'-AAC TGC AAC ATT GAA ATG GC-3'
5'-TGG AAA OTA TGT ATC TTG GAG G-3'

5'-CAT ATG CAT ACC ACA CAC-3'
5'-AGC TCA GAG ACA CCT CTC CA-3'
5'-TCA GCC TGA GTT TTC TTT AT-3'
5'-GGT CTG ATG AAA ATG TTC TCA AGC-3'
5'-AAC GTC TGC TCG TCA GAG TC-3'
5'-GCC TTG GGG GTA AAT ACT CT-3'
5'-TTT TCT TTT TTG CAG TTT ATC C-3'
5'-ATC TTC CAA AAA TGT CAT-3'

5'-GQC CAG GCT TTG TTC AGA-3'
5'-TTT AGC CTG AAA ATA CAC GC-3'
5'-TGC ACA TTA AAG GAA CAG GT-3'
5'-GAT CTG ATT ACT ATT GTC TGC TTG A-3'
5'-AAA TGT GAG TAG AAG GGA TAG GTT-3'
5'-GAG TGG CGG TGA GAA GGT AT-3'
5'-TGG AAT TTC TCC ATG TTG AG-3'
5'-GAA AAG AAT GCT GGA TAG-3'


FIG. 7D-3 



B Primer 



5'-ATG GTT GTA GAT GAG ACT GG-3'
5'-AAG CAT CTT AAT GGA TGG AAA-3'
5'-ATT TCA TTT GTA ATT TAC TAG CAG-3'
5'-CCA TGA ATA AGC CTT GCC-3'
5'-CAC AGC AGG GGT TCA TTT TT-3'
5'-TCA ATC TGT GGA GTC ATT GG-3'
5'-GGG ACC ATA GTT CTT GGT GA-3'
5'-GCN GGC TTT AGG GTG G-3'

5'-AAT CTT ATT GCT GTC TCA-3'
5'-CTG TAT TAG GAT ACT TGG CTA TTG A-3'
5'-CAA GGA GCA GGA AGA ACA GC-3'
5'-TAG ACT GGG TTG TTA GGG ACT CTC-3'
5'-CGA CTA CGT GCT GGC TAC TT-3'
5'-GGA ATT ACA GGC CAC TGC TC-3'
5'-CAC TTC AGT GCC TTC TTG AGA-3'
5'-CAT AAT AGG AGA ATA AGA-3'

5'-CAG GGT CTA TGA TAC GCT TT-3'
5'-GCT TTG CTC CTA GAG TCC AG-3'
5'-CAT AAT TTG CTG CTT TGG AT-3'
5'-GCT TTA TAG GAG GTA TCT TTN TGT G-3'
5'-TAA AAA AGN CCG ACT AGA CC-3'
5'-AGC CAT TGC TAT CTT TGA GG-3'
5'-AAG AGC TAT GAA AAG AGT TAA AGG A-3'
5'-CCA GTT TTT ATG GAC GGG GT-3'
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FIG. 7D'4 
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FIG. 7E'I 
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FIG. 7E-2 

GROUP 5

A Primer 



5'-AGG TCA TTG AGG TTT ATA TTC CCA-3'
5'-ATC AGG AGA TGT TGC CTT GC-3'
5'-AGG CAT ACT AGG CCG TAT T-3'
5'-CAG ACA ATG GCT TCC AAA AGT A-3'
5'-CCT GAA GGG TGT AAT TTT CA-3'
5'-TGA TTG GAG GTG GTA GAG GT-3'
5'-ATA ATA TCC TTT GAT CCT TTC GCT A-3'
5'-nC CTC ATT TAG CTG CAC TAA G-3'

5'-CAC CAT CTG TGT GGT ATT GG-3'
5'-TIQ TGC ACT CGT TAT GAG AA-3'
5'-AAC TAA GAC ACA CAA CCC CG-3'
5'-CTG CTG GAA CTT AAA AGT GC-3'
5'-CAA CAG ATC TCC CAA GGT AC-3'
5'-AGG CTG TCT TGG CAG AAA T-3'
5'-GAG GGC TGT TGA CCC AC-3'
5'-TCG GTA AAC ATT CAT CCA GA-3'

5'-AAA CAA AAT AGC CTT CAA AA-3'
5'-TAG GCC CAA GGA ATT NAA AA-3'
5'-AAA ATG ACT TCT TTG GGT GGG C-3'
5'-TTC GCT GAG ATC ATG CCA C-3'
5'-AGT GTT TTG AAG GTT GTA GGT TAA T-3'
5'-ATC TTG GAT TTA GGG TTG GC-3'
5'-TGT GTC ATT ACG CTT TTC ATC-3'
5'-TGC ATT GTT GTC ATG CCT-3'
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FIG, 7E'3 



B Primer 



5'-GAA CCC TAG GAA GTG AAA TAG AAA A-3'
5'-CAG GGC TAT GAT TGG ATG TC-3'
5'-TIZ CCA TCA GCG TCT TC-3'
5'-CAA ACT TAG GGT TGT TCC TCA C-3'
5'-TGA GAA GGT GTG TTA GGG TG-3'
5'-AGC TAT CAT GTA GAA AAG CAG CA-3'
5'-AAA TTT GGT TAT TTT TAA GCA AAC T-3'
5'-TTG CTA AAC CTT GGG TGT GT-3'

5'-GAC CTA TTT TGG TTA ACA ATT TAG A-3'
5'-CTG ATG GAG GTT AAG GCA AG-3'
5'-CCA ATT CAG TGG CAT CTA TG-3'
5'-AGA AAT GAG ATA TTG TTT TGG C-3'
5'-CTC ATA ACT CAA AAC CTC TG-3'
5'-GAT GTA ATC CTG TGC TAT GGC-3'
5'-TTG CCT GGA AAC CTG GTA-3'
5'-TGT CAA AAT GGA CCA ATC AG-3'

5'-GCC TGG TAA GTT GAT AGT GT-3'
5'-TCA TCA TCA CCA CAA ATG CT-3'
5'-GTG GGT AGC AAC ACT GTG GC-3'
5'-AGA CCT TTA GGT TGT TCA TGC TG-3'
5'-ATA TCT TTC AGG GGA GCA GG-3'
5'-GGC TCT GCT CCA TCT TCA TA-3'
5'-TCA AAT GGT TCA GGA GAA AGA-3'
5'-TAA AGT CTC CAT CTT CGA TTG T-3'
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FIG. 7E'4 



Annealing Temp.    Labeled Primer

68°    F
66°    R
62°    F
68°    F
6r     F
60°    F
64°    F
60°    R
66°    R
66°    F
60°    R
69°    F
66°    R
58°    F
60°    F
69°    F
68°    F
68°    F
62°    F
62°    R
66°    F
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FIG. 7F-I 
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FIG. 7F'2 

GROUP 6 

A Primer 



5'-TCA CCC CTA ATA CCC AAA AC-3'
5'-AGT GGA CAG TTG GTA TCT CA-3'
5'-ACT GGC CTG GCA GAG TCT-3'
5'-CAC AAT CAT ATG TNC CAA TT-3'
5'-CAG TAG GCA GGG GTG G-3'
5'-AAT TCA CAA GAC ACA ATC TCA G-3'
5'-AGC TGA CTT TAT GCT GTT CCT-3'
5'-CAA CAT ACT GCC TCA AAA-3'

5'-TTC GGC CAA AAA CAG AGT CC-3'
5'-AGT CAC CTT CTC TGT CTC CA-3'
5'-AAC ATC TTA GGG CAT CCT G-3'
5'-ATC TTT TAT TGT GGG GTG CT-3'
5'-CTG GGC AAC AAG AGT GAA AT-3'
5'-TGG AAA TAG AAT CCA GGC Tl-V
5'-GCA ACT TTT CTG TCA ATC CA-3'
5'-GCT ATT CCC ACA AAG GCA-3'

5'-ATC ATG GGA AGT GCG TGG-3'
5'-CTT TCC TGC CAA CCT CTT TC-3'
5'-GGG CAC AGG CAT GTG T-3'
5'-GAG AAC TAA TCC CTT CTG GC-3'
5'-TCC CTA CGT TGC ATT TTA-3'
5'-AAA TCG CTA GAA AAT GTC CA-3'
5'-TTT TGG AAT TTC TAG CCT CC-3'
5'-CAA CAG GTC CAG GCT ATG TC-3'
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FIG. 7F-3 



B Primer 



5'-AAT ATG AAG GGA TGT TGA AT-3'
5'-TGT GAT CAG CCC AGG AAG AG-3'
5'-CAG CCA TTC GAG AGG TGT-3'
5'-ATT AAA TGT GCA TAC GCA AA-3'
5'-GGG TGT GTC TGT GTG ACA AC-3'
5'-AGA ACT AAA GTT GCC TGT TON TGT K-V
5'-TTT TCC ATG CCC TTC TAT CA-3'
5'-TAC ACA AAA AGG AGG TCA TT-3'

5'-TGA GAA CTT CCA CAT AGC AG-3'
5'-AGG CCT CAT TCA AAA TCT GT-3'
5'-AAT GAT TTA AAA TAG ATT AGG AGC A-3'
5'-TGC CCA GAC TTC TCA CCT-3'
5'-CAA ATT CCA CAA AGC CGT-3'
5'-TCT ATC GTT AAC TTT ATT GAT TCA 0-3'
5'-ACC AAA CTT CAA ATT TTC GG-3'
5'-QQZ QGA TCA TTG AGT GC-3'

5'-TAA TTA GTT GCT GGT TTG AA-3'
5'-TTG GGT TCA AGC GAT TCT CC-3'
5'-GGC TGC ATT CTG AAA GGT TA-3'
5'-AGC TTC ATA AAG AGT CTG GAA AAT-3'
5'-TAC CCA GCC AAA CTA TTA-3'
5'-TCA CAC CTG GGA ATT AGA AG-3'
5'-TGA AAC CCA CAG ATA TTG GG-3'
5'-TAT CCA TAC ACA CCA TGC CA-3'

FIG. 7F-4

Annealing Labeled 
Temp. Primer 
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FIG, 7G-I 
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FIG, 7G'2 

GROUP 7

A Primer 



5'-GCT AGG ATT ACA GGC ACA T-3'
5'-TGC TGC ACA TCT TAG GGA GT-3'
5'-CTT TGC AGA ACC CAT GAT TAT GA-3'
5'-AAT TCT GAA GAG GCA AAT CTA A-3'
5'-AAC AAT TGG GAA ATG GCT TA-3'
5'-TTA TGG CAG CCC AAA TGG ACT A-3'
5'-CAG GCA CAC GCA TAC AC-3'
5'-AGG TTG ATA GAC CAT GGA GAC A-3'

5'-CTT ACT GTG TTG CCC AAG GT-3'
5'-TGT AAA GTT TTG TAC ATG GTG TAA T-3'
5'-TAG CCA TGA TAG GAA ATC AAC C-3'
5'-GTT TAC GCC TCA TGG ATT TA-3'
5'-GAG AGG TGG TTT TCA GTG GT-3'
5'-GQG CAA CAC AGT GAG ACT CT-3'
5'-TAT TGG ATA CTT GAA TCT GCT 0-3'
5'-CTG ATA ATA AAA CCA GGA AGA CAC-3'

5'-TTC TGG AAA TGG ATA CTG GT-3'
5'-ATG GGA GAC GTA ATA CAC CC-3'
5'-AGA GCA AGT CCC TGC C-3'
5'-TCA AAG ACC CAT ATC AAC CA-3'
5'-ATT TCC TGA GGT CTA AAG CAC CC-3'
5'-AGC TTC TAT CCA ACA GGG GC-3'
5'-AAC AGG CTT GAA AGT CTC TGT C-3'
5'-TTC AGG GTC TTT TGA AGA GG-3'
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FIG. 76-3 



B Primer 



5'-AGG CTC CTA CTA CCG TCA C-3'
5'-ACA GCG CTC AGA AAT CAT ATA A-3'
5'-ATT GCC TTG GAG GGC G-3'
5'-AGG AAA ATA TAC ACA ACC CAA G-3'
5'-TAG GTT GTG GTG GGT GTT AC-3'
5'-GCA GAA TGT TGC CCA AAA CTC A-3'
5'-ACT TCA GGA ATA GCC TTT ACC-3'
5'-TTT TAT TGT TAT GTG GCT TTC A-3'

5'-AGC TCT ATG ATT CAT TTC AAG TTT G-3'
5'-TCC TAA CAT TCT GCT ACC CA-3'
5'-GAG ATC GTG CAG CAC TTG T-3'
5'-GGG CAC ACA GTC CCA A-3'
5'-TCA GGG ATA GTT GGT GGG TA-3'
5'-TGG GAT AGA AGC AAC ACA GA-3'
5'-TGC ATC ACC TCA CAT AGG TTA-3'
5'-TAT TGG CCT GAA GTG GTG-3'

5'-TTT GGA TGC ACA GGA AGT TG-3'
5'-ATG CTG CTG GTC TGA GG-3'
5'-CAG CCT CGG AGA AAC G-3'
5'-GTG CTG AAA AGC GAC ACT TA-3'
5'-TTA GGC CCA GTC CAC ACT CAA G-3'
5'-ACC AGA ATG TGA ACG ACC CT-3'
5'-GCC TAT TTG ATA ATG CTG TAC G-3'
5'-AGA AGG CAT TAA ATT TTG CA-3'

FIG. 7G-4

Annealing Labeled 
Temp. Primer 
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FIG. 7H-I 



Marker Alleles (bp) Heterozygosity Chromosome

SET A

D11S914
D11S910
D17S784
D22S274
D19S216
D21S259
D20S103



(275-285) 
(249-261) 
(226-238) 
(202-214) 
(179-191) 
(117-131) 
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D11S903 



(254-288) 
(224-244) 
(195-213) 
(176-181) 
(145-163) 
(118-134) 
(99-109) 
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82% 
70% 
81% 
75% 
75% 
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12 
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10 
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(144-160) 
(116-130) 
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69% 
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77% 
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FIG, 7H-2 

GROUP 8

A Primer 



5'-ATC TCA TGG GAG TAG CGT TG-3'
5'-AGC m GCA GAG AAG GCA AG-3'
5'-GAG TCT OCT AAA TGC TGG GG-3'
5'-GTC CAG GAG GTT GAT GC-3'
5'-TCT TGT GAG TCT AAC TCC GC-3'
5'-AGA ATG TGG TCT CAC AAG CC-3'
5'-GTT CAT AGA GGG ACA AGA CAC AGT-3'

5'-ATT TGA GAG CAG CGT GTT TT-3'
5'-GGC ACT TGT AAT CCC CG-3'
5'-CAA AAA AAT GTT TTA CTA AGC AGG-3'
5'-TTC ACA ACA GCC AAT GGT AG-3'
5'-CCC GGC TGT GAA TAT ACT TAA TGC-3'
5'-AAC TGG TTT TGG TAG TGA GA-3'
5'-AAC ACT TCG ATG TTC CTT CC-3'

5'-CCT CAA ACC GGA CAA CTA TTT-3'
5'-CAA AAA GGC AGA ATG CAG TA-3'
5'-ATT GGG TTT ACT TGT GCC TT-3'
5'-CCT CCA ATC TGC ACC TGA CT-3'
5'-GCT CCC GGC TGG TTT T-3'
5'-TTN CAA CAT AGG TTA TAG GCG-3'
5'-TGT TGG AGT TAA TGT GCC AT-3'
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FIG, 7H-3 



B Primer 



Annealing 
Temp. 



Labeled 
Primer 



5'-GAC CCA CAT CAC CAT TAC TG-3'
5'-TCC CTG CTC ATA ACT CAG CC-3'
5'-AGC TCC TGC ACA GTT CTT AAA TA-3'
5'-AGT GCC CAT TTC TCA AAA TA-3'
5'-GGC CCA TGT CTT TTT TAG GT-3'
5'-AGG GAA TGT CAA TGA AAA CC-3'
5'-CCA TGA TGT TTG GTT AAT CAC A-3'

5'-CCA TTA TGG GGA GTA GGG GT-3'
5'-TGA GCC ACT GCA CCT G-3'
5'-AGG CAT GAC TCA CCG C-3'
5'-TTC TCA AGG TTC GTC CAT GT-3'
5'-ZCC AAC AGC AAT GGG AAG TT-3'
5'-GAG GTG CCC GCT AGT A-3'
5'-AGC TGA GAG CGC ATG TAT AA-3'

5'-CAG AGA GCA AGA TCC TAC CTC-3'
5'-TCC AGA GTC AAA AAC ACA GG-3'
5'-CGT GAT TTC ATT TCT TGC TG-3'
5'-TAG GCT TTG TTC TGG GGT TC-3'
5'-GCA GGA AAT CGC AGG AAC TT-3'
5'-GGC CCA GTT CAT TTT CTA GC-3'
5'-TCT TTG ACC CAG ACC TCT AA-3'
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FIG. 71'/ 



Marker Alleles (bp) Heterozygosity Chromosome 
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FIG. 71-2 

GROUP 9 

A Primer 



5'-AAG TGA TCC ACC TGC CTT G-3'
5'-CAA TTC TGT TCT AAG ATT ATT TTG G-3'
5'-GGG GTT GAT TGA AGT TGG TT-3'
5'-TAC TAA CCA AAA GAG TTG GGG-3'
5'-AGC AGC AGC AGC CAT ATT GT-3'
5'-TTT ACC TAA GGC TGG ATC TG-3'
5'-TCG TGA GAN TAC TGC TTT GG-3'

5'-AAA ACA CCT TAC CTA AAA CAG CA-3'
5'-ATG TTC AGA AAG GCC ATG TCA TTT G-3'
5'-TGC ACC ACA GCA TAC CAG TA-3'
5'-CTT GGG GAC TGA ACC ATC TT-3'
5'-TTT GTG ATG GTC TTT TAT AGG CAT A-3'
5'-CCT CAA TGC ACA ACT CCT-3'
5'-CTG ACGA CAG TTT CAG TAT CTC TAT C-3'

5'-GCC TTC ACT AAG CAA TCT CTA AA-3'
5'-ACT ACC GCC AGG CAC T-3'
5'-CTG TGG GAT TCC TTA GTG ATA C-3'
5'-GAA GTA AAG CAA GTT CTA TCC ACG-3'
5'-CTC GCG CTG GGT ACA GTT AT-3'
5'-TTG ATT TGG AAG ATT TTC AC-3'
5'-AAC ACA CAT ACA AAC ACA CGC AGA T-3'
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FIG. 71-3 

Annealing Labeled 

B Primer Temp. Primer 

5'-GCC TCT GAG AAT TAG TGT CTG TC-3'
5'-CTC TGG CTG AGG AGG C-3'
5'-CAA GAG CCA TAC CCA TGA-3'
5'-CTA TCA TTC AGA AAA TGT TGG C-3'
5'-AGT CAG GCC CAC CCA ATT TA-3'
5'-CAA AGT TGA CAC TGA TTA TAG CA-3'
5'-TTT TGT CTA GCC ATG ATT GC-3'

5'-AGA TGA TGG TGA GTC CTG AG-3'
5'-TCC CTA ACG GAT ACA CAG CAA CAC-3'
5'-AAT GAA CAG CAA AAA CTA AGG GA-3'
5'-AGC TAC CAT AGG GCT GGA GG-3'
5'-GGC TCA AAG TGT TTG CAC TG-3'
5'-CTC AGA CCT GGG TCA AGA TA-3'
5'-TTT CCA GAT TTA GGG GTG TAT 0-3'

5'-ACA TGC TCT GAA TCA CCT GA-3'
5'-CTA AGA TAT GAA AAC CTA AGG GA-3'
5'-ATA TTC AGA CAA AAG CCA AGT TA-3'
5'-TCT GTG TAC GTT GAA AAT CCC-3'
5'-AGA TCA GAG GAG TGG GTT CC-3'
5'-GGG GCA GAA TGG GTA T-3'
5'-TTC CAG ACA GGA CAG CCT GC-3'
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FIG. 7J'I 

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FIG. 7J-3 

Annealing Labeled 

B Primer Temp. Primer 

5'-AGT GAA ACT CGG NCC CTA-3'
5'-AAC AAA CTT GCT TAT GAG TGT TAG T-3'
5'-AAA ACA TTT CCA TTA CCA CTG-3'
5'-GCT GAA GGC TGT TCT ATG GA-3'
5'-GCA GTT GGG TTA TTT CAA GTC-3'
5'-CTA CGT ACA TGG CTG CAA-3'
5'-CCA TCT TGG TGT GAG GGC-3'
5'-GCT GAG CAA GGC ATT GTT T-3'

5'-ACT GAG GTC ATG CAA GAG GC-3'
5'-GAG CAA GAC TGC ATC TCA AA-3'
5'-GTG TCA GGT CGG GGT G-3'
5'-ACG ATT TCT GGG AGA CTA TAT TGC-3'
5'-TTG TCA CTG CTT TTC TCT GC-3'
5'-CCC CTG AAG ACC GTG A-3'
5'-CCA ACA CCT GAG TCA GCA TA-3'

5'-ATG TAA CAA AAT GGA GTC GG-3'
5'-TCC TAA TTC ACT GGG AAA AC-3'
5'-ATT ACA GGC GTG ACA CAC C-3'
5'-GTT TGC CTG GGG ATT GAT TT-3'
5'-ATA GAC TGT GTA CTG GGC ATT GA-3'
5'-ATG AAG AAA TAT ATA CAG TGC CG-3'
5'-CAT GCC TAG ACT CCT GAT CC-3'
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FIG, 7K-I 



Marker      Alleles (bp)   Heterozygosity   Chromosome

SET A
D5S408      (247-299)      73%              5
D9S180      (220-265)      63%              9
[illegible] [illegible]    oZ/o             J
D1S304      (168-206)      60%              1
D6S344      (139-159)      72%              6
D12S76      (112-124)      71%              12
D10S219     (89-103)       76%              10

SET B
D11S906     (291-303)      73%              11
D15S121     (258-264)      66%              15
D5S425      (224-248)      77%              5
D5S395      (189-213)      81%              5
D13S217     (160-174)      67%              13
D2S206      (123-151)      79%              2
D6S263      (90-114)       81%              6

SET C
D14S74      (291-313)      79%              14
D20S98      (259-275)      79%              20
D9S168      (227-247)      75%              9
D16S421     (206-212)      56%              16
D13S173     (166-178)      82%              13
D8S261      (128-148)      77%              8
D9S178      (93-99)        66%              9
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FIG, 7K'2 

GROUP 11 

A Primer 



5'-ACA ACT TCC AAC CCT GAG AT-3'
5'-CAG TGG TTT GGA ATC GAA CC-3'
5'-GGC CAG TTC AGT CAA GTG-3'
5'-ACC CTT TTT CCT CCA ATC AT-3'
5'-CTC CAG CCT GGG TCA CTA-3'
5'-QQQ CTA CAT GAT GAG ACC CT-3'
5'-TCT TTC TAG CAC CCC CC-3'

5'-AGC TGG GCA CCG ATA GTA GT-3'
5'-TTG TAT CAG GGA TTT GGT TK-V
5'-CTC CAG CCT GCT GAC C-3'
5'-GCA GAT GGA AAA CAC CAC TT-3'
5'-ATG CTG GGA TCA CAG GC-3'
5'-TTA AAA ATT AAG TAG GCT TTT GGT T-3'
5'-CTT AAG GCA AAA TTC TTT TCA ACA C-3'

5'-CCT GTA CCA CTA CCT GAG TTG AGT-3'
5'-GAA CTT GCA TAA CCC GAA T-3'
5'-GGT TTG TGG TCT TTG TAA GG-3'
5'-ACA TGA ACC GAT TGG ACT GA-3'
5'-CCC TGT TCC AGT AAT GAT GAC C-3'
5'-TGC CAC TGT CTT GAA AAT CC-3'
5'-GAA TAA AAC AGG GTT TGG G-3'
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FIG, 7K-3 



B Primer 



Annealing 
Temp. 



Labeled 
Primer 



5'-ACT GTG CCT AGC CTT CAT TT-3'
5'-AGC TAT TTT TOG GGG CTG AG-3'
5'-TGG TTC CAG CAT ATA GCG-3'
5'-AGA AGC TGA AAG CTG AGT GG-3'
5'-CTA ATG CAT GAC AAT AAT ATT TCC A-3'
5'-GCG GAG CTT CTT TTC TGT TG-3'
5'-GCA GAG AAC CTA AAG CAT CC-3'

5'-GCA CAG GCA AAG ANG AGG TA-3'
5'-TGT TGT CGC TTC AGT ACA TA-3'
5'-TCT TGG GCA AGC CAT C-3'
5'-ACC TGC TGC TGG AAG ATT AC-3'
5'-AAC CTG GTG GAC TTT TGC T-3'
5'-GTC CTC ATG TGT TTA TGC TGT-3'
5'-CTC AAA GTA AGA CCA TAA AAT ACC A-3'

5'-CTT TGG CTG CCC GAA A-3'
5'-CAA GGG TAT GTT CCC CAA AA-3'
5'-TGG TTT GTT TGT ATA ACT ATC AT TG-3'
5'-CCG TTCC CTA TAT TTC CTG G-3'
5'-GTC TCT GGC TGC TCT CAA GAC TAT-3'
5'-TAT GGC CCA GCA ATG TGT AT-3'
5'-TTT CTC TAA GAA CTT TGG GG-3'
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FIG. 7L-I 



Marker      Alleles (bp)   Heterozygosity   Chromosome

SET A
D19S209     (206-272)      77%              19
D14S77      (203-251)      92%              14
D10S189     (180-188)      72%              10
D12S87      (142-168)      79%              12
D13S158     (99-113)       81%              13

SET B
D11S931     (251-267)      73%              11
D16S415     (208-234)      72%              16
D11S925     (173-199)      84%              11
D16S409     (135-147)      70%              16
D13S219     (117-127)      64%              13
D22S284     (86-102)       76%              22

SET C
D13S157     (250-264)      72%
D14S78      (211-233)      66%
D13S168     (173-197)      76%
D15S122     (143-159)      77%
D18S70      (111-126)      83%
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FIG. 7L-2 

GROUP 12 



A Primer 



5'-TTC ATT CAC AAA TCN ATG GC-3'
5'-GCG TGA GTC ACT GTG CC-3'
5'-CAA A\G TAA CCA TTG AGC CC-3'
5'-CAC TAG GTG ATG CTG GAC Kl-V

5'-GTA CCC ACG GAG TGA AAG AA-3'

5'-GAT TGC TTG AGC CCA G-3'
5'-CCA GTA ATG TTA TGT AAG TCA ATG C-3'
5'-AGA ACC AAG GTC GTA ACT CCT G-3'
5'-TGA ATC TTA CAT CCC ATC CC-3'
5'-AAG CAA ATA TGC AAA ATT GC-3'
5'-ATG GGT ATT TAA CTT CTC TAC ACA G-3'

5'-AGC TGA GAA ATC ACA ACA GAG A-3'
5'-GGC AGG GAT AAG TAT GTC CT-3'
5'-GCC TAG CCC AGT GGT G-3'
5'-GAT AAT CAT GCC CCC CA-3'
5'-AAG GCT GAN CTC TAC CG-3'
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FIG. 7L-3 



Annealing Labeled 
B Primer Temp. Primer 



5'-CTG GAG AGC ATA GAG GNA GA-3'
5'-CAG ACA GAA ATT AAC GAG AGT TGA A-3'
5'-TTG ATA GAA GAA GCG ATA GAT CG-3'
5'-CTG CAC AAA CAC TTG AAA CA-3'

5'-GCT TTG ACA ATT TAG GAG CA-3'

5'-GAG AAA TAG TAT GTG TTT GCC-3'
5'-TAG CCA CTG TAC CCC AGC-3'
5'-TTA GAC CAT TAT GGG GGC AA-3'
5'-AGT CAG TCT GTC CAG AGG TG-3'
5'-TCC TTC TGT TTC TTG ACT TAA CA-3'
5'-GCT CTC TTG AGG TCG TTA CA-3'

5'-TGG AAA TTT GCT GAC AGT AGA T-3'
5'-AAA GGT AAC ATC CAA GGG GT-3'
5'-TGC TTG TGC CTA TGT TCT TG-3'
5'-CCC AGT ATC TGG CAC GTA G-3'
5'-GGA ATG TCA AGA AGT ACC TAC CAT A-3'
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FIG, 7M-I 



Marker      Alleles (bp)   Heterozygosity   Chromosome

SET A
D13S156     (272-286)      80%              13
D19S226     (235-263)      84%              19
D16S422     (188-212)      78%              16
D18S65      (168-178)      71%              18
D16S413     (131-149)      83%              16
D20S95      (82-100)       83%              20

SET B
D22S279     (249-258)      73%              22
D19S222     (233-241)      65%              19
D6S281      (203-219)      67%              6
D17S808     (147-167)      67%              17

SET C
D21S260     (267-277)      51%              21
D19S218     (240-256)      60%              19
D22S280     (208-220)      81%              22
D17S799     (186-200)      68%              17
D19S210     (165-177)      73%              19
D11S922     (88-138)       92%              11
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FIG. 7M-2 

GROUP 13 

A Primer 



5'-ATT AGC CCA GOT ATG GTG AC-3'
5'-CCA GCA GAT TTT GGT GTT GTC TA-3'
5'-CAG TGT AAC CTG GGG GC-3'
5'-GAG GCA GGA AAT TGC AGT GT-3'
5'-ACT CCA GCC CGA GTA A-3'

5'-AAA GCA AGG CTT CGT CTT AA-3'

5'-GCG ATC CAG CCT GTG T-3'
5'-GAA ATG TCC TAT TTG AAA CTG TGC-3'
5'-CTG GTA GTG TCA GGC ATG GC-3'
5'-ACC CTA GAC AGG ATG CCA-3'

5'-AGC TGT TCA TGC TTC CAT CT-3'
5'-TTT GCA TTT TCT GGA GTT TT-3'
5'-GCT CCA GCC TAT CAG GAT G-3'
5'-ATT GCC AGC CGT CAG TT-3'
5'-TCA CAC TCA CTG GTC TCT CA-3'
5'-GGG GCA TCT TTG GCT A-3'
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FIG. 7M-3 



Annealing Labeled 
B Primer Temp. Primer 



5'-GCT GTG GTA TGA GTT ACT TAA ACA C-3'
5'-GGT CCA GGA TTT GAA CTA AAG CA-3'
5'-CTT TCG ATT AGT TTA CCA GAA TGA G-3'
5'-GCT GGT CTT ACT ATC TCA GGG Q-V
5'-GGT CAC AGG TGG GTT C-3'

5'-TTC NTC ATT TTA TTG TGT GCG-3'

5'-TGT AAA TGG GGT AAG TGA TGC-3'
5'-CTG TTG AAA TGT ATC CAG TAA ATC G-3'
5'-CCT ATG TTT CAG GCA AAG GC-3'

5'-TGT GGG TTT TCT CAG GTT AT-3'

5'-AGA GCC CAG AAT ATT GAC CC-3'
5'-AAT GTC CCT AAA CAC ATG GA-3'
5'-GAT TCC AGA TCA CAA AAC TGG T-3'
5'-GAC CAG CAT ATC ATT ATA GAC AAG C-3'
5'-GGT GTG CCT GTG TGT AAA AG-3'
5'-TCC GGT TTG GTT CAG G-3'
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FIG. 8 
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FIG. 10A 
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